* [Qemu-devel] [PATCH v2 01/17] RISC-V: add vfp field in CPURISCVState
2019-09-11 6:25 [Qemu-devel] [PATCH v2 00/17] RISC-V: support vector extension liuzhiwei
@ 2019-09-11 6:25 ` liuzhiwei
2019-09-11 14:51 ` Chih-Min Chao
2019-09-11 22:32 ` Richard Henderson
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 02/17] RISC-V: turn on vector extension from command line by cfg.ext_v Property liuzhiwei
` (16 subsequent siblings)
17 siblings, 2 replies; 43+ messages in thread
From: liuzhiwei @ 2019-09-11 6:25 UTC (permalink / raw)
To: Alistair.Francis, palmer, sagark, kbastian, riku.voipio, laurent,
wenmeng_zhang
Cc: qemu-riscv, qemu-devel, wxy194768, LIU Zhiwei
From: LIU Zhiwei <zhiwei_liu@c-sky.com>
Signed-off-by: LIU Zhiwei <zhiwei_liu@c-sky.com>
---
target/riscv/cpu.h | 28 ++++++++++++++++++++++++++++
1 file changed, 28 insertions(+)
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 0adb307..c992b1d 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -93,9 +93,37 @@ typedef struct CPURISCVState CPURISCVState;
#include "pmp.h"
+#define VLEN 128
+#define VUNIT(x) (VLEN / x)
+
struct CPURISCVState {
target_ulong gpr[32];
uint64_t fpr[32]; /* assume both F and D extensions */
+
+ /* vector coprocessor state. */
+ struct {
+ union VECTOR {
+ float64 f64[VUNIT(64)];
+ float32 f32[VUNIT(32)];
+ float16 f16[VUNIT(16)];
+ uint64_t u64[VUNIT(64)];
+ int64_t s64[VUNIT(64)];
+ uint32_t u32[VUNIT(32)];
+ int32_t s32[VUNIT(32)];
+ uint16_t u16[VUNIT(16)];
+ int16_t s16[VUNIT(16)];
+ uint8_t u8[VUNIT(8)];
+ int8_t s8[VUNIT(8)];
+ } vreg[32];
+ target_ulong vxrm;
+ target_ulong vxsat;
+ target_ulong vl;
+ target_ulong vstart;
+ target_ulong vtype;
+ float_status fp_status;
+ } vfp;
+
+ bool foflag;
target_ulong pc;
target_ulong load_res;
target_ulong load_val;
--
2.7.4
^ permalink raw reply related [flat|nested] 43+ messages in thread
* Re: [Qemu-devel] [PATCH v2 01/17] RISC-V: add vfp field in CPURISCVState
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 01/17] RISC-V: add vfp field in CPURISCVState liuzhiwei
@ 2019-09-11 14:51 ` Chih-Min Chao
2019-09-11 22:39 ` Richard Henderson
2019-09-17 8:09 ` liuzhiwei
2019-09-11 22:32 ` Richard Henderson
1 sibling, 2 replies; 43+ messages in thread
From: Chih-Min Chao @ 2019-09-11 14:51 UTC (permalink / raw)
To: liuzhiwei
Cc: Palmer Dabbelt, open list:RISC-V, Sagar Karandikar,
Bastian Koppelmann, riku.voipio, laurent, wxy194768,
qemu-devel@nongnu.org Developers, wenmeng_zhang,
Alistair Francis
On Wed, Sep 11, 2019 at 2:35 PM liuzhiwei <zhiwei_liu@c-sky.com> wrote:
> From: LIU Zhiwei <zhiwei_liu@c-sky.com>
>
> Signed-off-by: LIU Zhiwei <zhiwei_liu@c-sky.com>
> ---
> target/riscv/cpu.h | 28 ++++++++++++++++++++++++++++
> 1 file changed, 28 insertions(+)
>
> diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> index 0adb307..c992b1d 100644
> --- a/target/riscv/cpu.h
> +++ b/target/riscv/cpu.h
> @@ -93,9 +93,37 @@ typedef struct CPURISCVState CPURISCVState;
>
> #include "pmp.h"
>
> +#define VLEN 128
> +#define VUNIT(x) (VLEN / x)
> +
> struct CPURISCVState {
> target_ulong gpr[32];
> uint64_t fpr[32]; /* assume both F and D extensions */
> +
> + /* vector coprocessor state. */
> + struct {
> + union VECTOR {
> + float64 f64[VUNIT(64)];
> + float32 f32[VUNIT(32)];
> + float16 f16[VUNIT(16)];
> + uint64_t u64[VUNIT(64)];
> + int64_t s64[VUNIT(64)];
> + uint32_t u32[VUNIT(32)];
> + int32_t s32[VUNIT(32)];
> + uint16_t u16[VUNIT(16)];
> + int16_t s16[VUNIT(16)];
> + uint8_t u8[VUNIT(8)];
> + int8_t s8[VUNIT(8)];
> + } vreg[32];
> + target_ulong vxrm;
> + target_ulong vxsat;
> + target_ulong vl;
> + target_ulong vstart;
> + target_ulong vtype;
> + float_status fp_status;
> + } vfp;
> +
> + bool foflag;
> target_ulong pc;
> target_ulong load_res;
> target_ulong load_val;
> --
> 2.7.4
>
>
Could VLEN be configurable at CPU initialization instead of being fixed at
compile time?
Take the integer elements as an example; the difference would be that the
stride of vfp.vreg[x] isn't contiguous:
struct {
union VECTOR {
uint64_t *u64;
uint32_t *u32;
uint16_t *u16;
uint8_t *u8;
} vreg[32];
} vfp;
initialization
int vlen = 256; /* parameter from cpu command line option */
int elem = vlen / 8; /* bytes per vector register */
int size = elem * 32;
uint8_t *mem = malloc(size);
for (int idx = 0; idx < 32; ++idx) {
vfp.vreg[idx].u64 = (void *)&mem[idx * elem];
vfp.vreg[idx].u32 = (void *)&mem[idx * elem];
vfp.vreg[idx].u16 = (void *)&mem[idx * elem];
}
chihmin
* Re: [Qemu-devel] [PATCH v2 01/17] RISC-V: add vfp field in CPURISCVState
2019-09-11 14:51 ` Chih-Min Chao
@ 2019-09-11 22:39 ` Richard Henderson
2019-09-12 14:53 ` Chih-Min Chao
2019-09-17 8:09 ` liuzhiwei
1 sibling, 1 reply; 43+ messages in thread
From: Richard Henderson @ 2019-09-11 22:39 UTC (permalink / raw)
To: Chih-Min Chao, liuzhiwei
Cc: Palmer Dabbelt, open list:RISC-V, Sagar Karandikar,
Bastian Koppelmann, riku.voipio, laurent, wxy194768,
qemu-devel@nongnu.org Developers, wenmeng_zhang,
Alistair Francis
On 9/11/19 10:51 AM, Chih-Min Chao wrote:
> Could the VLEN be configurable in cpu initialization but not fixed in
> compilation phase ?
> Take the integer element as example and the difference should be the
> stride of vfp.vreg[x] isn't continuous
Do you really want an unbounded amount of vector register storage?
> uint8_t *mem = malloc(size)
> for (int idx = 0; idx < 32; ++idx) {
> vfp.vreg[idx].u64 = (void *)&mem[idx * elem];
> vfp.vreg[idx].u32 = (void *)&mem[idx * elem];
> vfp.vreg[idx].u16 = (void *)&mem[idx * elem];
> }
This isn't adjusting the stride of the elements. And in any case this would
have to be re-adjusted for every vsetvl.
r~
* Re: [Qemu-devel] [PATCH v2 01/17] RISC-V: add vfp field in CPURISCVState
2019-09-11 22:39 ` Richard Henderson
@ 2019-09-12 14:53 ` Chih-Min Chao
2019-09-12 15:06 ` Richard Henderson
0 siblings, 1 reply; 43+ messages in thread
From: Chih-Min Chao @ 2019-09-12 14:53 UTC (permalink / raw)
To: Richard Henderson
Cc: Palmer Dabbelt, open list:RISC-V, Sagar Karandikar,
Bastian Koppelmann, riku.voipio, laurent, wxy194768,
qemu-devel@nongnu.org Developers, wenmeng_zhang,
Alistair Francis, liuzhiwei
On Thu, Sep 12, 2019 at 6:39 AM Richard Henderson <richard.henderson@linaro.org> wrote:
> On 9/11/19 10:51 AM, Chih-Min Chao wrote:
> > Could the VLEN be configurable in cpu initialization but not fixed in
> > compilation phase ?
> > Take the integer element as example and the difference should be the
> > stride of vfp.vreg[x] isn't continuous
>
> Do you really want an unbounded amount of vector register storage?
Hi Richard,
VLEN is an implementation-defined parameter, and the only limitation in the
spec is that it must be a power of 2.
What I would prefer is for the value to be adjustable at runtime.
>
> > uint8_t *mem = malloc(size)
> > for (int idx = 0; idx < 32; ++idx) {
> > vfp.vreg[idx].u64 = (void *)&mem[idx * elem];
> > vfp.vreg[idx].u32 = (void *)&mem[idx * elem];
> > vfp.vreg[idx].u16 = (void *)&mem[idx * elem];
> > }
>
> This isn't adjusting the stride of the elements. And in any case this would
> have to be re-adjusted for every vsetvl.

Not sure about the relation with vsetvl. Could you provide an example?
Chih-Min
>
> r~
>
* Re: [Qemu-devel] [PATCH v2 01/17] RISC-V: add vfp field in CPURISCVState
2019-09-12 14:53 ` Chih-Min Chao
@ 2019-09-12 15:06 ` Richard Henderson
0 siblings, 0 replies; 43+ messages in thread
From: Richard Henderson @ 2019-09-12 15:06 UTC (permalink / raw)
To: Chih-Min Chao
Cc: Palmer Dabbelt, open list:RISC-V, Sagar Karandikar,
Bastian Koppelmann, riku.voipio, laurent, wxy194768,
qemu-devel@nongnu.org Developers, wenmeng_zhang,
Alistair Francis, liuzhiwei
On 9/12/19 10:53 AM, Chih-Min Chao wrote:
>
>
> On Thu, Sep 12, 2019 at 6:39 AM Richard Henderson <richard.henderson@linaro.org> wrote:
>
> On 9/11/19 10:51 AM, Chih-Min Chao wrote:
> > Could the VLEN be configurable in cpu initialization but not fixed in
> > compilation phase ?
> > Take the integer element as example and the difference should be the
> > stride of vfp.vreg[x] isn't continuous
>
> Do you really want an unbounded amount of vector register storage?
>
>
> Hi Richard,
>
> VLEN is implementation-defined parameter and the only limitation on spec is
> that it must be power of 2.
> What I prefer is the value could be adjustable in runtime.
Ok, fine, I suppose. I'll let a risc-v maintainer opine on whether there
should be some sanity check on the bounds of VLEN. If you really do have an
unbounded vlen, you'll need to consider carefully how you want to manage migration.
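A minimal sketch of the kind of sanity check mentioned above, assuming VLEN
arrives as a bit count from a CPU option. The bounds VLEN_MIN_BITS and
VLEN_MAX_BITS and both function names are hypothetical, chosen only for
illustration; the thread does not settle on any particular limits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bounds for illustration; not taken from the patch or spec. */
#define VLEN_MIN_BITS 128u
#define VLEN_MAX_BITS 1024u

/* True if vlen_bits is a power of two inside the assumed bounds. */
static bool vlen_is_valid(uint32_t vlen_bits)
{
    bool pow2 = vlen_bits != 0 && (vlen_bits & (vlen_bits - 1)) == 0;
    return pow2 && vlen_bits >= VLEN_MIN_BITS && vlen_bits <= VLEN_MAX_BITS;
}

/* Bytes of backing storage needed for all 32 vector registers. */
static size_t vreg_file_bytes(uint32_t vlen_bits)
{
    return (size_t)32 * (vlen_bits / 8);
}
```

With an upper bound like this, the worst-case size of the register file is
known in advance, which may make the migration question easier to reason about.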
> > uint8_t *mem = malloc(size)
> > for (int idx = 0; idx < 32; ++idx) {
> > vfp.vreg[idx].u64 = (void *)&mem[idx * elem];
> > vfp.vreg[idx].u32 = (void *)&mem[idx * elem];
> > vfp.vreg[idx].u16 = (void *)&mem[idx * elem];
> > }
>
> This isn't adjusting the stride of the elements. And in any case this would
> have to be re-adjusted for every vsetvl.
>
> Not sure about the relation with vsetvl. Could you provide an example ?
Well, I think it's merely that there's no point in having so many different
pointers into the block of memory that provides the backing storage. I've
asserted elsewhere in the thread that we shouldn't have an array of 32
"registers" anyway.
r~
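One way to read the point about backing storage, as a rough sketch: keep a
single flat buffer plus the per-register byte length, and compute addresses
from the register and element index instead of storing one pointer per
register and element type. The VRegFile type and every function name here are
hypothetical and are not the layout used by the patch or by QEMU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define NUM_VREGS 32

/* Hypothetical container: one flat allocation, registers back to back. */
typedef struct {
    uint32_t vlenb;  /* bytes per vector register (VLEN / 8) */
    uint8_t *data;   /* NUM_VREGS * vlenb bytes */
} VRegFile;

/* Allocate zeroed storage for a runtime-chosen VLEN; returns 0 on success. */
static int vregfile_init(VRegFile *vf, uint32_t vlen_bits)
{
    vf->vlenb = vlen_bits / 8;
    vf->data = calloc(NUM_VREGS, vf->vlenb);
    return vf->data ? 0 : -1;
}

static void vregfile_free(VRegFile *vf)
{
    free(vf->data);
    vf->data = NULL;
}

/* Address of the first byte of register reg. */
static uint8_t *vreg_ptr(VRegFile *vf, unsigned reg)
{
    return vf->data + (size_t)reg * vf->vlenb;
}

/* Byte offset, from the start of the file, of element idx of esz bytes. */
static size_t vreg_elem_offset(const VRegFile *vf, unsigned reg,
                               unsigned idx, unsigned esz)
{
    return (size_t)reg * vf->vlenb + (size_t)idx * esz;
}
```

Nothing in this layout needs to be re-pointed when the element width changes;
only the esz argument passed to the offset computation differs.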
* Re: [Qemu-devel] [PATCH v2 01/17] RISC-V: add vfp field in CPURISCVState
2019-09-11 14:51 ` Chih-Min Chao
2019-09-11 22:39 ` Richard Henderson
@ 2019-09-17 8:09 ` liuzhiwei
1 sibling, 0 replies; 43+ messages in thread
From: liuzhiwei @ 2019-09-17 8:09 UTC (permalink / raw)
To: Chih-Min Chao
Cc: Palmer Dabbelt, open list:RISC-V, Sagar Karandikar,
Bastian Koppelmann, riku.voipio, laurent, wxy194768,
qemu-devel@nongnu.org Developers, wenmeng_zhang,
Alistair Francis
On 2019/9/11 10:51 PM, Chih-Min Chao wrote:
>
>
> On Wed, Sep 11, 2019 at 2:35 PM liuzhiwei <zhiwei_liu@c-sky.com> wrote:
>
> From: LIU Zhiwei <zhiwei_liu@c-sky.com>
>
> Signed-off-by: LIU Zhiwei <zhiwei_liu@c-sky.com>
> ---
> target/riscv/cpu.h | 28 ++++++++++++++++++++++++++++
> 1 file changed, 28 insertions(+)
>
> diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> index 0adb307..c992b1d 100644
> --- a/target/riscv/cpu.h
> +++ b/target/riscv/cpu.h
> @@ -93,9 +93,37 @@ typedef struct CPURISCVState CPURISCVState;
>
> #include "pmp.h"
>
> +#define VLEN 128
> +#define VUNIT(x) (VLEN / x)
> +
> struct CPURISCVState {
> target_ulong gpr[32];
> uint64_t fpr[32]; /* assume both F and D extensions */
> +
> + /* vector coprocessor state. */
> + struct {
> + union VECTOR {
> + float64 f64[VUNIT(64)];
> + float32 f32[VUNIT(32)];
> + float16 f16[VUNIT(16)];
> + uint64_t u64[VUNIT(64)];
> + int64_t s64[VUNIT(64)];
> + uint32_t u32[VUNIT(32)];
> + int32_t s32[VUNIT(32)];
> + uint16_t u16[VUNIT(16)];
> + int16_t s16[VUNIT(16)];
> + uint8_t u8[VUNIT(8)];
> + int8_t s8[VUNIT(8)];
> + } vreg[32];
> + target_ulong vxrm;
> + target_ulong vxsat;
> + target_ulong vl;
> + target_ulong vstart;
> + target_ulong vtype;
> + float_status fp_status;
> + } vfp;
> +
> + bool foflag;
> target_ulong pc;
> target_ulong load_res;
> target_ulong load_val;
> --
> 2.7.4
>
>
> Could the VLEN be configurable in cpu initialization but not fixed in
> compilation phase ?
Yes, it's important that VLEN is configurable to support different
types of cpu.
> Take the integer element as example and the difference should be the
> stride of vfp.vreg[x] isn't continuous
>
> struct {
> union VECTOR {
> uint64_t *u64;
> uint16_t *u16;
> uint8_t *u8;
> } vreg[32];
> } vfp;
>
> initialization
> int vlen = 256; //parameter from cpu command line option
> int elem = vlen / 8;
> int size = elem * 32;
>
> uint8_t *mem = malloc(size)
> for (int idx = 0; idx < 32; ++idx) {
> vfp.vreg[idx].u64 = (void *)&mem[idx * elem];
> vfp.vreg[idx].u32 = (void *)&mem[idx * elem];
> vfp.vreg[idx].u16 = (void *)&mem[idx * elem];
> }
>
> chihmin
It's a good idea. I will accept it.
Thanks for the review.
Zhiwei
* Re: [Qemu-devel] [PATCH v2 01/17] RISC-V: add vfp field in CPURISCVState
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 01/17] RISC-V: add vfp field in CPURISCVState liuzhiwei
2019-09-11 14:51 ` Chih-Min Chao
@ 2019-09-11 22:32 ` Richard Henderson
1 sibling, 0 replies; 43+ messages in thread
From: Richard Henderson @ 2019-09-11 22:32 UTC (permalink / raw)
To: liuzhiwei, Alistair.Francis, palmer, sagark, kbastian,
riku.voipio, laurent, wenmeng_zhang
Cc: qemu-riscv, qemu-devel, wxy194768
On 9/11/19 2:25 AM, liuzhiwei wrote:
> uint64_t fpr[32]; /* assume both F and D extensions */
> +
> + /* vector coprocessor state. */
> + struct {
> + union VECTOR {
> + float64 f64[VUNIT(64)];
> + float32 f32[VUNIT(32)];
> + float16 f16[VUNIT(16)];
> + uint64_t u64[VUNIT(64)];
> + int64_t s64[VUNIT(64)];
> + uint32_t u32[VUNIT(32)];
> + int32_t s32[VUNIT(32)];
> + uint16_t u16[VUNIT(16)];
> + int16_t s16[VUNIT(16)];
> + uint8_t u8[VUNIT(8)];
> + int8_t s8[VUNIT(8)];
> + } vreg[32];
> + target_ulong vxrm;
> + target_ulong vxsat;
> + target_ulong vl;
> + target_ulong vstart;
> + target_ulong vtype;
> + float_status fp_status;
> + } vfp;
Is there a good reason why you're putting all of these into a sub-structure?
And more, a sub-structure whose name, vfp, looks like it is copied from ARM?
Why are the vxrm, vxsat, vl, vstart, vtype fields sized target_ulong? I would
think that most could be uint32_t. Although I suppose frm is also target_ulong
and need not be...
Why are you adding a new fp_status field? The new vector floating point
instructions set the exact same fflags exception bits as normal fp instructions.
r~
* [Qemu-devel] [PATCH v2 02/17] RISC-V: turn on vector extension from command line by cfg.ext_v Property
2019-09-11 6:25 [Qemu-devel] [PATCH v2 00/17] RISC-V: support vector extension liuzhiwei
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 01/17] RISC-V: add vfp field in CPURISCVState liuzhiwei
@ 2019-09-11 6:25 ` liuzhiwei
2019-09-11 15:00 ` Chih-Min Chao
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 03/17] RISC-V: support vector extension csr liuzhiwei
` (15 subsequent siblings)
17 siblings, 1 reply; 43+ messages in thread
From: liuzhiwei @ 2019-09-11 6:25 UTC (permalink / raw)
To: Alistair.Francis, palmer, sagark, kbastian, riku.voipio, laurent,
wenmeng_zhang
Cc: qemu-riscv, qemu-devel, wxy194768, LIU Zhiwei
From: LIU Zhiwei <zhiwei_liu@c-sky.com>
Signed-off-by: LIU Zhiwei <zhiwei_liu@c-sky.com>
---
target/riscv/cpu.c | 6 +++++-
target/riscv/cpu.h | 2 ++
2 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index f8d07bd..9f93ce7 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -109,7 +109,7 @@ static void set_resetvec(CPURISCVState *env, int resetvec)
static void riscv_any_cpu_init(Object *obj)
{
CPURISCVState *env = &RISCV_CPU(obj)->env;
- set_misa(env, RVXLEN | RVI | RVM | RVA | RVF | RVD | RVC | RVU);
+ set_misa(env, RVXLEN | RVI | RVM | RVA | RVF | RVD | RVC | RVU | RVV);
set_priv_version(env, PRIV_VERSION_1_11_0);
set_resetvec(env, DEFAULT_RSTVEC);
}
@@ -406,6 +406,9 @@ static void riscv_cpu_realize(DeviceState *dev, Error **errp)
if (cpu->cfg.ext_u) {
target_misa |= RVU;
}
+ if (cpu->cfg.ext_v) {
+ target_misa |= RVV;
+ }
set_misa(env, RVXLEN | target_misa);
}
@@ -441,6 +444,7 @@ static Property riscv_cpu_properties[] = {
DEFINE_PROP_BOOL("c", RISCVCPU, cfg.ext_c, true),
DEFINE_PROP_BOOL("s", RISCVCPU, cfg.ext_s, true),
DEFINE_PROP_BOOL("u", RISCVCPU, cfg.ext_u, true),
+ DEFINE_PROP_BOOL("v", RISCVCPU, cfg.ext_v, true),
DEFINE_PROP_BOOL("Counters", RISCVCPU, cfg.ext_counters, true),
DEFINE_PROP_BOOL("Zifencei", RISCVCPU, cfg.ext_ifencei, true),
DEFINE_PROP_BOOL("Zicsr", RISCVCPU, cfg.ext_icsr, true),
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index c992b1d..2c7072a 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -67,6 +67,7 @@
#define RVC RV('C')
#define RVS RV('S')
#define RVU RV('U')
+#define RVV RV('V')
/* S extension denotes that Supervisor mode exists, however it is possible
to have a core that support S mode but does not have an MMU and there
@@ -250,6 +251,7 @@ typedef struct RISCVCPU {
bool ext_c;
bool ext_s;
bool ext_u;
+ bool ext_v;
bool ext_counters;
bool ext_ifencei;
bool ext_icsr;
--
2.7.4
* Re: [Qemu-devel] [PATCH v2 02/17] RISC-V: turn on vector extension from command line by cfg.ext_v Property
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 02/17] RISC-V: turn on vector extension from command line by cfg.ext_v Property liuzhiwei
@ 2019-09-11 15:00 ` Chih-Min Chao
0 siblings, 0 replies; 43+ messages in thread
From: Chih-Min Chao @ 2019-09-11 15:00 UTC (permalink / raw)
To: liuzhiwei
Cc: Palmer Dabbelt, open list:RISC-V, Sagar Karandikar,
Bastian Koppelmann, riku.voipio, laurent, wxy194768,
qemu-devel@nongnu.org Developers, wenmeng_zhang,
Alistair Francis
On Wed, Sep 11, 2019 at 2:36 PM liuzhiwei <zhiwei_liu@c-sky.com> wrote:
> From: LIU Zhiwei <zhiwei_liu@c-sky.com>
>
> Signed-off-by: LIU Zhiwei <zhiwei_liu@c-sky.com>
> ---
> target/riscv/cpu.c | 6 +++++-
> target/riscv/cpu.h | 2 ++
> 2 files changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index f8d07bd..9f93ce7 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -109,7 +109,7 @@ static void set_resetvec(CPURISCVState *env, int resetvec)
> static void riscv_any_cpu_init(Object *obj)
> {
> CPURISCVState *env = &RISCV_CPU(obj)->env;
> - set_misa(env, RVXLEN | RVI | RVM | RVA | RVF | RVD | RVC | RVU);
> + set_misa(env, RVXLEN | RVI | RVM | RVA | RVF | RVD | RVC | RVU | RVV);
> set_priv_version(env, PRIV_VERSION_1_11_0);
> set_resetvec(env, DEFAULT_RSTVEC);
> }
> @@ -406,6 +406,9 @@ static void riscv_cpu_realize(DeviceState *dev, Error **errp)
> if (cpu->cfg.ext_u) {
> target_misa |= RVU;
> }
> + if (cpu->cfg.ext_v) {
> + target_misa |= RVV;
> + }
>
> set_misa(env, RVXLEN | target_misa);
> }
> @@ -441,6 +444,7 @@ static Property riscv_cpu_properties[] = {
> DEFINE_PROP_BOOL("c", RISCVCPU, cfg.ext_c, true),
> DEFINE_PROP_BOOL("s", RISCVCPU, cfg.ext_s, true),
> DEFINE_PROP_BOOL("u", RISCVCPU, cfg.ext_u, true),
> + DEFINE_PROP_BOOL("v", RISCVCPU, cfg.ext_v, true),
> DEFINE_PROP_BOOL("Counters", RISCVCPU, cfg.ext_counters, true),
> DEFINE_PROP_BOOL("Zifencei", RISCVCPU, cfg.ext_ifencei, true),
> DEFINE_PROP_BOOL("Zicsr", RISCVCPU, cfg.ext_icsr, true),
> diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> index c992b1d..2c7072a 100644
> --- a/target/riscv/cpu.h
> +++ b/target/riscv/cpu.h
> @@ -67,6 +67,7 @@
> #define RVC RV('C')
> #define RVS RV('S')
> #define RVU RV('U')
> +#define RVV RV('V')
>
> /* S extension denotes that Supervisor mode exists, however it is possible
> to have a core that support S mode but does not have an MMU and there
> @@ -250,6 +251,7 @@ typedef struct RISCVCPU {
> bool ext_c;
> bool ext_s;
> bool ext_u;
> + bool ext_v;
> bool ext_counters;
> bool ext_ifencei;
> bool ext_icsr;
> --
> 2.7.4
>
>
Reviewed-by: Chih-Min Chao <chihmin.chao@sifive.com>
* [Qemu-devel] [PATCH v2 03/17] RISC-V: support vector extension csr
2019-09-11 6:25 [Qemu-devel] [PATCH v2 00/17] RISC-V: support vector extension liuzhiwei
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 01/17] RISC-V: add vfp field in CPURISCVState liuzhiwei
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 02/17] RISC-V: turn on vector extension from command line by cfg.ext_v Property liuzhiwei
@ 2019-09-11 6:25 ` liuzhiwei
2019-09-11 15:25 ` [Qemu-devel] [Qemu-riscv] " Chih-Min Chao
2019-09-11 22:43 ` [Qemu-devel] " Richard Henderson
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 04/17] RISC-V: add vector extension configure instruction liuzhiwei
` (14 subsequent siblings)
17 siblings, 2 replies; 43+ messages in thread
From: liuzhiwei @ 2019-09-11 6:25 UTC (permalink / raw)
To: Alistair.Francis, palmer, sagark, kbastian, riku.voipio, laurent,
wenmeng_zhang
Cc: qemu-riscv, qemu-devel, wxy194768, LIU Zhiwei
From: LIU Zhiwei <zhiwei_liu@c-sky.com>
Signed-off-by: LIU Zhiwei <zhiwei_liu@c-sky.com>
---
target/riscv/cpu_bits.h | 15 ++++++++++++
target/riscv/csr.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++---
2 files changed, 76 insertions(+), 4 deletions(-)
diff --git a/target/riscv/cpu_bits.h b/target/riscv/cpu_bits.h
index 11f971a..9eb43ec 100644
--- a/target/riscv/cpu_bits.h
+++ b/target/riscv/cpu_bits.h
@@ -29,6 +29,14 @@
#define FSR_NXA (FPEXC_NX << FSR_AEXC_SHIFT)
#define FSR_AEXC (FSR_NVA | FSR_OFA | FSR_UFA | FSR_DZA | FSR_NXA)
+/* Vector Fixed-Point round model */
+#define FSR_VXRM_SHIFT 9
+#define FSR_VXRM (0x3 << FSR_VXRM_SHIFT)
+
+/* Vector Fixed-Point saturation flag */
+#define FSR_VXSAT_SHIFT 8
+#define FSR_VXSAT (0x1 << FSR_VXSAT_SHIFT)
+
/* Control and Status Registers */
/* User Trap Setup */
@@ -48,6 +56,13 @@
#define CSR_FRM 0x002
#define CSR_FCSR 0x003
+/* User Vector CSRs */
+#define CSR_VSTART 0x008
+#define CSR_VXSAT 0x009
+#define CSR_VXRM 0x00a
+#define CSR_VL 0xc20
+#define CSR_VTYPE 0xc21
+
/* User Timers and Counters */
#define CSR_CYCLE 0xc00
#define CSR_TIME 0xc01
diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index e0d4586..a6131ff 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -87,12 +87,12 @@ static int ctr(CPURISCVState *env, int csrno)
return 0;
}
-#if !defined(CONFIG_USER_ONLY)
static int any(CPURISCVState *env, int csrno)
{
return 0;
}
+#if !defined(CONFIG_USER_ONLY)
static int smode(CPURISCVState *env, int csrno)
{
return -!riscv_has_ext(env, RVS);
@@ -158,8 +158,10 @@ static int read_fcsr(CPURISCVState *env, int csrno, target_ulong *val)
return -1;
}
#endif
- *val = (riscv_cpu_get_fflags(env) << FSR_AEXC_SHIFT)
- | (env->frm << FSR_RD_SHIFT);
+ *val = (env->vfp.vxrm << FSR_VXRM_SHIFT)
+ | (env->vfp.vxsat << FSR_VXSAT_SHIFT)
+ | (riscv_cpu_get_fflags(env) << FSR_AEXC_SHIFT)
+ | (env->frm << FSR_RD_SHIFT);
return 0;
}
@@ -172,10 +174,60 @@ static int write_fcsr(CPURISCVState *env, int csrno, target_ulong val)
env->mstatus |= MSTATUS_FS;
#endif
env->frm = (val & FSR_RD) >> FSR_RD_SHIFT;
+ env->vfp.vxrm = (val & FSR_VXRM) >> FSR_VXRM_SHIFT;
+ env->vfp.vxsat = (val & FSR_VXSAT) >> FSR_VXSAT_SHIFT;
riscv_cpu_set_fflags(env, (val & FSR_AEXC) >> FSR_AEXC_SHIFT);
return 0;
}
+static int read_vtype(CPURISCVState *env, int csrno, target_ulong *val)
+{
+ *val = env->vfp.vtype;
+ return 0;
+}
+
+static int read_vl(CPURISCVState *env, int csrno, target_ulong *val)
+{
+ *val = env->vfp.vl;
+ return 0;
+}
+
+static int read_vxrm(CPURISCVState *env, int csrno, target_ulong *val)
+{
+ *val = env->vfp.vxrm;
+ return 0;
+}
+
+static int read_vxsat(CPURISCVState *env, int csrno, target_ulong *val)
+{
+ *val = env->vfp.vxsat;
+ return 0;
+}
+
+static int read_vstart(CPURISCVState *env, int csrno, target_ulong *val)
+{
+ *val = env->vfp.vstart;
+ return 0;
+}
+
+static int write_vxrm(CPURISCVState *env, int csrno, target_ulong val)
+{
+ env->vfp.vxrm = val;
+ return 0;
+}
+
+static int write_vxsat(CPURISCVState *env, int csrno, target_ulong val)
+{
+ env->vfp.vxsat = val;
+ return 0;
+}
+
+static int write_vstart(CPURISCVState *env, int csrno, target_ulong val)
+{
+ env->vfp.vstart = val;
+ return 0;
+}
+
/* User Timers and Counters */
static int read_instret(CPURISCVState *env, int csrno, target_ulong *val)
{
@@ -873,7 +925,12 @@ static riscv_csr_operations csr_ops[CSR_TABLE_SIZE] = {
[CSR_FFLAGS] = { fs, read_fflags, write_fflags },
[CSR_FRM] = { fs, read_frm, write_frm },
[CSR_FCSR] = { fs, read_fcsr, write_fcsr },
-
+ /* Vector CSRs */
+ [CSR_VSTART] = { any, read_vstart, write_vstart },
+ [CSR_VXSAT] = { any, read_vxsat, write_vxsat },
+ [CSR_VXRM] = { any, read_vxrm, write_vxrm },
+ [CSR_VL] = { any, read_vl },
+ [CSR_VTYPE] = { any, read_vtype },
/* User Timers and Counters */
[CSR_CYCLE] = { ctr, read_instret },
[CSR_INSTRET] = { ctr, read_instret },
--
2.7.4
* Re: [Qemu-devel] [Qemu-riscv] [PATCH v2 03/17] RISC-V: support vector extension csr
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 03/17] RISC-V: support vector extension csr liuzhiwei
@ 2019-09-11 15:25 ` Chih-Min Chao
2019-09-11 22:43 ` [Qemu-devel] " Richard Henderson
1 sibling, 0 replies; 43+ messages in thread
From: Chih-Min Chao @ 2019-09-11 15:25 UTC (permalink / raw)
To: liuzhiwei
Cc: Palmer Dabbelt, open list:RISC-V, Sagar Karandikar,
Bastian Koppelmann, riku.voipio, laurent, wxy194768,
qemu-devel@nongnu.org Developers, wenmeng_zhang,
Alistair Francis
On Wed, Sep 11, 2019 at 2:38 PM liuzhiwei <zhiwei_liu@c-sky.com> wrote:
> From: LIU Zhiwei <zhiwei_liu@c-sky.com>
>
> Signed-off-by: LIU Zhiwei <zhiwei_liu@c-sky.com>
> ---
> target/riscv/cpu_bits.h | 15 ++++++++++++
> target/riscv/csr.c | 65 ++++++++++++++++++++++++++++++++++++++++++++++---
> 2 files changed, 76 insertions(+), 4 deletions(-)
>
> diff --git a/target/riscv/cpu_bits.h b/target/riscv/cpu_bits.h
> index 11f971a..9eb43ec 100644
> --- a/target/riscv/cpu_bits.h
> +++ b/target/riscv/cpu_bits.h
> @@ -29,6 +29,14 @@
> #define FSR_NXA (FPEXC_NX << FSR_AEXC_SHIFT)
> #define FSR_AEXC (FSR_NVA | FSR_OFA | FSR_UFA | FSR_DZA | FSR_NXA)
>
> +/* Vector Fixed-Point round model */
> +#define FSR_VXRM_SHIFT 9
> +#define FSR_VXRM (0x3 << FSR_VXRM_SHIFT)
> +
> +/* Vector Fixed-Point saturation flag */
> +#define FSR_VXSAT_SHIFT 8
> +#define FSR_VXSAT (0x1 << FSR_VXSAT_SHIFT)
> +
> /* Control and Status Registers */
>
> /* User Trap Setup */
> @@ -48,6 +56,13 @@
> #define CSR_FRM 0x002
> #define CSR_FCSR 0x003
>
> +/* User Vector CSRs */
> +#define CSR_VSTART 0x008
> +#define CSR_VXSAT 0x009
> +#define CSR_VXRM 0x00a
> +#define CSR_VL 0xc20
> +#define CSR_VTYPE 0xc21
> +
> /* User Timers and Counters */
> #define CSR_CYCLE 0xc00
> #define CSR_TIME 0xc01
> diff --git a/target/riscv/csr.c b/target/riscv/csr.c
> index e0d4586..a6131ff 100644
> --- a/target/riscv/csr.c
> +++ b/target/riscv/csr.c
> @@ -87,12 +87,12 @@ static int ctr(CPURISCVState *env, int csrno)
> return 0;
> }
>
> -#if !defined(CONFIG_USER_ONLY)
> static int any(CPURISCVState *env, int csrno)
> {
> return 0;
> }
>
> +#if !defined(CONFIG_USER_ONLY)
> static int smode(CPURISCVState *env, int csrno)
> {
> return -!riscv_has_ext(env, RVS);
> @@ -158,8 +158,10 @@ static int read_fcsr(CPURISCVState *env, int csrno, target_ulong *val)
> return -1;
> }
> #endif
> - *val = (riscv_cpu_get_fflags(env) << FSR_AEXC_SHIFT)
> - | (env->frm << FSR_RD_SHIFT);
> + *val = (env->vfp.vxrm << FSR_VXRM_SHIFT)
> + | (env->vfp.vxsat << FSR_VXSAT_SHIFT)
> + | (riscv_cpu_get_fflags(env) << FSR_AEXC_SHIFT)
> + | (env->frm << FSR_RD_SHIFT);
> return 0;
> }
>
> @@ -172,10 +174,60 @@ static int write_fcsr(CPURISCVState *env, int csrno, target_ulong val)
> env->mstatus |= MSTATUS_FS;
> #endif
> env->frm = (val & FSR_RD) >> FSR_RD_SHIFT;
> + env->vfp.vxrm = (val & FSR_VXRM) >> FSR_VXRM_SHIFT;
> + env->vfp.vxsat = (val & FSR_VXSAT) >> FSR_VXSAT_SHIFT;
> riscv_cpu_set_fflags(env, (val & FSR_AEXC) >> FSR_AEXC_SHIFT);
> return 0;
> }
>
> +static int read_vtype(CPURISCVState *env, int csrno, target_ulong *val)
> +{
> + *val = env->vfp.vtype;
> + return 0;
> +}
> +
> +static int read_vl(CPURISCVState *env, int csrno, target_ulong *val)
> +{
> + *val = env->vfp.vl;
> + return 0;
> +}
> +
> +static int read_vxrm(CPURISCVState *env, int csrno, target_ulong *val)
> +{
> + *val = env->vfp.vxrm;
> + return 0;
> +}
> +
> +static int read_vxsat(CPURISCVState *env, int csrno, target_ulong *val)
> +{
> + *val = env->vfp.vxsat;
> + return 0;
> +}
> +
> +static int read_vstart(CPURISCVState *env, int csrno, target_ulong *val)
> +{
> + *val = env->vfp.vstart;
> + return 0;
> +}
> +
> +static int write_vxrm(CPURISCVState *env, int csrno, target_ulong val)
> +{
> + env->vfp.vxrm = val;
> + return 0;
> +}
> +
> +static int write_vxsat(CPURISCVState *env, int csrno, target_ulong val)
> +{
> + env->vfp.vxsat = val;
> + return 0;
> +}
> +
> +static int write_vstart(CPURISCVState *env, int csrno, target_ulong val)
> +{
> + env->vfp.vstart = val;
> + return 0;
> +}
> +
> /* User Timers and Counters */
> static int read_instret(CPURISCVState *env, int csrno, target_ulong *val)
> {
> @@ -873,7 +925,12 @@ static riscv_csr_operations csr_ops[CSR_TABLE_SIZE] = {
> [CSR_FFLAGS] = { fs, read_fflags, write_fflags },
> [CSR_FRM] = { fs, read_frm, write_frm },
> [CSR_FCSR] = { fs, read_fcsr, write_fcsr },
> -
> + /* Vector CSRs */
> + [CSR_VSTART] = { any, read_vstart, write_vstart },
> + [CSR_VXSAT] = { any, read_vxsat, write_vxsat },
> + [CSR_VXRM] = { any, read_vxrm, write_vxrm },
> + [CSR_VL] = { any, read_vl },
> + [CSR_VTYPE] = { any, read_vtype },
> /* User Timers and Counters */
> [CSR_CYCLE] = { ctr, read_instret },
> [CSR_INSTRET] = { ctr, read_instret },
> --
> 2.7.4
>
>
>
Reviewed-by: Chih-Min Chao <chihmin.chao@sifive.com>
* Re: [Qemu-devel] [PATCH v2 03/17] RISC-V: support vector extension csr
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 03/17] RISC-V: support vector extension csr liuzhiwei
2019-09-11 15:25 ` [Qemu-devel] [Qemu-riscv] " Chih-Min Chao
@ 2019-09-11 22:43 ` Richard Henderson
2019-09-14 13:58 ` Palmer Dabbelt
1 sibling, 1 reply; 43+ messages in thread
From: Richard Henderson @ 2019-09-11 22:43 UTC (permalink / raw)
To: liuzhiwei, Alistair.Francis, palmer, sagark, kbastian,
riku.voipio, laurent, wenmeng_zhang
Cc: qemu-riscv, qemu-devel, wxy194768
On 9/11/19 2:25 AM, liuzhiwei wrote:
> @@ -873,7 +925,12 @@ static riscv_csr_operations csr_ops[CSR_TABLE_SIZE] = {
> [CSR_FFLAGS] = { fs, read_fflags, write_fflags },
> [CSR_FRM] = { fs, read_frm, write_frm },
> [CSR_FCSR] = { fs, read_fcsr, write_fcsr },
> -
> + /* Vector CSRs */
> + [CSR_VSTART] = { any, read_vstart, write_vstart },
> + [CSR_VXSAT] = { any, read_vxsat, write_vxsat },
> + [CSR_VXRM] = { any, read_vxrm, write_vxrm },
> + [CSR_VL] = { any, read_vl },
> + [CSR_VTYPE] = { any, read_vtype },
Is there really no MSTATUS bit to disable the vector unit,
as there is for the FPU? That seems like a defect in the
specification if true...
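Even without a status bit, the table entries could at least be gated on the
extension being present rather than using "any". A rough, self-contained
sketch follows, modeled on the smode() predicate convention (0 allows the
access, -1 rejects it). DemoCPUState, DEMO_RV, demo_has_ext and demo_vs are
hypothetical stand-ins, not QEMU definitions; the only assumption carried
over is that misa assigns bit (letter - 'A') to each extension.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the CPU state; only misa is modeled. */
typedef struct {
    uint64_t misa;
} DemoCPUState;

/* misa bit for an extension letter: 'A' is bit 0, so 'V' is bit 21. */
#define DEMO_RV(x) ((uint64_t)1 << ((x) - 'A'))
#define DEMO_RVV   DEMO_RV('V')

/* Nonzero if the given extension bit is set in misa. */
static int demo_has_ext(const DemoCPUState *env, uint64_t ext)
{
    return (env->misa & ext) != 0;
}

/* CSR predicate: 0 if the vector CSRs may be accessed, -1 otherwise. */
static int demo_vs(const DemoCPUState *env, int csrno)
{
    (void)csrno; /* every vector CSR shares the same check in this sketch */
    return -!demo_has_ext(env, DEMO_RVV);
}
```

A predicate of this shape would slot into the first field of each vector CSR
entry in place of any; it does not model a per-context enable bit.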
r~
* Re: [Qemu-devel] [PATCH v2 03/17] RISC-V: support vector extension csr
2019-09-11 22:43 ` [Qemu-devel] " Richard Henderson
@ 2019-09-14 13:58 ` Palmer Dabbelt
0 siblings, 0 replies; 43+ messages in thread
From: Palmer Dabbelt @ 2019-09-14 13:58 UTC (permalink / raw)
To: richard.henderson
Cc: qemu-riscv, sagark, Bastian Koppelmann, riku.voipio, laurent,
wxy194768, qemu-devel, wenmeng_zhang, Alistair Francis,
zhiwei_liu
On Wed, 11 Sep 2019 15:43:29 PDT (-0700), richard.henderson@linaro.org wrote:
> On 9/11/19 2:25 AM, liuzhiwei wrote:
>> @@ -873,7 +925,12 @@ static riscv_csr_operations csr_ops[CSR_TABLE_SIZE] = {
>> [CSR_FFLAGS] = { fs, read_fflags, write_fflags },
>> [CSR_FRM] = { fs, read_frm, write_frm },
>> [CSR_FCSR] = { fs, read_fcsr, write_fcsr },
>> -
>> + /* Vector CSRs */
>> + [CSR_VSTART] = { any, read_vstart, write_vstart },
>> + [CSR_VXSAT] = { any, read_vxsat, write_vxsat },
>> + [CSR_VXRM] = { any, read_vxrm, write_vxrm },
>> + [CSR_VL] = { any, read_vl },
>> + [CSR_VTYPE] = { any, read_vtype },
>
> Is there really no MSTATUS bit to disable the vector unit,
> as there is for the FPU? That seems like a defect in the
> specification if true...
The privileged part of the V extension hasn't been written yet, which is part
of the reason this is a draft that we know will change. We're letting it into
QEMU so people can more easily prototype software, but won't be letting it into
Linux or GCC to avoid users depending on behavior that will change in the
future.
* [Qemu-devel] [PATCH v2 04/17] RISC-V: add vector extension configure instruction
2019-09-11 6:25 [Qemu-devel] [PATCH v2 00/17] RISC-V: support vector extension liuzhiwei
` (2 preceding siblings ...)
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 03/17] RISC-V: support vector extension csr liuzhiwei
@ 2019-09-11 6:25 ` liuzhiwei
2019-09-11 16:04 ` [Qemu-devel] [Qemu-riscv] " Chih-Min Chao
2019-09-11 23:09 ` [Qemu-devel] " Richard Henderson
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 05/17] RISC-V: add vector extension load and store instructions liuzhiwei
` (13 subsequent siblings)
17 siblings, 2 replies; 43+ messages in thread
From: liuzhiwei @ 2019-09-11 6:25 UTC (permalink / raw)
To: Alistair.Francis, palmer, sagark, kbastian, riku.voipio, laurent,
wenmeng_zhang
Cc: qemu-riscv, qemu-devel, wxy194768, LIU Zhiwei
From: LIU Zhiwei <zhiwei_liu@c-sky.com>
Signed-off-by: LIU Zhiwei <zhiwei_liu@c-sky.com>
---
target/riscv/Makefile.objs | 2 +-
target/riscv/helper.h | 3 +
target/riscv/insn32.decode | 5 ++
target/riscv/insn_trans/trans_rvv.inc.c | 46 ++++++++++++
target/riscv/translate.c | 1 +
target/riscv/vector_helper.c | 126 ++++++++++++++++++++++++++++++++
6 files changed, 182 insertions(+), 1 deletion(-)
create mode 100644 target/riscv/insn_trans/trans_rvv.inc.c
create mode 100644 target/riscv/vector_helper.c
diff --git a/target/riscv/Makefile.objs b/target/riscv/Makefile.objs
index b1c79bc..d577cef 100644
--- a/target/riscv/Makefile.objs
+++ b/target/riscv/Makefile.objs
@@ -1,4 +1,4 @@
-obj-y += translate.o op_helper.o cpu_helper.o cpu.o csr.o fpu_helper.o gdbstub.o pmp.o
+obj-y += translate.o op_helper.o cpu_helper.o cpu.o csr.o fpu_helper.o vector_helper.o gdbstub.o pmp.o
DECODETREE = $(SRC_PATH)/scripts/decodetree.py
diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index debb22a..652f8c3 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -76,3 +76,6 @@ DEF_HELPER_2(mret, tl, env, tl)
DEF_HELPER_1(wfi, void, env)
DEF_HELPER_1(tlb_flush, void, env)
#endif
+/* Vector functions */
+DEF_HELPER_4(vector_vsetvli, void, env, i32, i32, i32)
+DEF_HELPER_4(vector_vsetvl, void, env, i32, i32, i32)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 77f794e..5dc009c 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -62,6 +62,7 @@
@r_rm ....... ..... ..... ... ..... ....... %rs2 %rs1 %rm %rd
@r2_rm ....... ..... ..... ... ..... ....... %rs1 %rm %rd
@r2 ....... ..... ..... ... ..... ....... %rs1 %rd
+@r2_zimm . zimm:11 ..... ... ..... ....... %rs1 %rd
@sfence_vma ....... ..... ..... ... ..... ....... %rs2 %rs1
@sfence_vm ....... ..... ..... ... ..... ....... %rs1
@@ -203,3 +204,7 @@ fcvt_w_d 1100001 00000 ..... ... ..... 1010011 @r2_rm
fcvt_wu_d 1100001 00001 ..... ... ..... 1010011 @r2_rm
fcvt_d_w 1101001 00000 ..... ... ..... 1010011 @r2_rm
fcvt_d_wu 1101001 00001 ..... ... ..... 1010011 @r2_rm
+
+# *** RV32V Extension ***
+vsetvli 0 ........... ..... 111 ..... 1010111 @r2_zimm
+vsetvl 1000000 ..... ..... 111 ..... 1010111 @r
diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
new file mode 100644
index 0000000..82e7ad6
--- /dev/null
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -0,0 +1,46 @@
+/*
+ * RISC-V translation routines for the RVV Standard Extension.
+ *
+ * Copyright (c) 2019 C-SKY Limited. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#define GEN_VECTOR_R(INSN) \
+static bool trans_##INSN(DisasContext *ctx, arg_##INSN * a) \
+{ \
+ TCGv_i32 s1 = tcg_const_i32(a->rs1); \
+ TCGv_i32 s2 = tcg_const_i32(a->rs2); \
+ TCGv_i32 d = tcg_const_i32(a->rd); \
+ gen_helper_vector_##INSN(cpu_env, s1, s2, d); \
+ tcg_temp_free_i32(s1); \
+ tcg_temp_free_i32(s2); \
+ tcg_temp_free_i32(d); \
+ return true; \
+}
+
+#define GEN_VECTOR_R2_ZIMM(INSN) \
+static bool trans_##INSN(DisasContext *ctx, arg_##INSN * a) \
+{ \
+ TCGv_i32 s1 = tcg_const_i32(a->rs1); \
+ TCGv_i32 zimm = tcg_const_i32(a->zimm); \
+ TCGv_i32 d = tcg_const_i32(a->rd); \
+ gen_helper_vector_##INSN(cpu_env, s1, zimm, d); \
+ tcg_temp_free_i32(s1); \
+ tcg_temp_free_i32(zimm); \
+ tcg_temp_free_i32(d); \
+ return true; \
+}
+
+GEN_VECTOR_R2_ZIMM(vsetvli)
+GEN_VECTOR_R(vsetvl)
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 8d6ab73..587c23e 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -706,6 +706,7 @@ static bool gen_shift(DisasContext *ctx, arg_r *a,
#include "insn_trans/trans_rva.inc.c"
#include "insn_trans/trans_rvf.inc.c"
#include "insn_trans/trans_rvd.inc.c"
+#include "insn_trans/trans_rvv.inc.c"
#include "insn_trans/trans_privileged.inc.c"
/*
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
new file mode 100644
index 0000000..b279e6f
--- /dev/null
+++ b/target/riscv/vector_helper.c
@@ -0,0 +1,126 @@
+/*
+ * RISC-V Vector Extension Helpers for QEMU.
+ *
+ * Copyright (c) 2019 C-SKY Limited. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "cpu.h"
+#include "exec/exec-all.h"
+#include "exec/helper-proto.h"
+#include <math.h>
+
+#define VECTOR_HELPER(name) HELPER(glue(vector_, name))
+
+static inline void vector_vtype_set_ill(CPURISCVState *env)
+{
+ env->vfp.vtype = ((target_ulong)1) << (sizeof(target_ulong) - 1);
+ return;
+}
+
+static inline int vector_vtype_get_sew(CPURISCVState *env)
+{
+ return (env->vfp.vtype >> 2) & 0x7;
+}
+
+static inline int vector_get_width(CPURISCVState *env)
+{
+ return 8 * (1 << vector_vtype_get_sew(env));
+}
+
+static inline int vector_get_lmul(CPURISCVState *env)
+{
+ return 1 << (env->vfp.vtype & 0x3);
+}
+
+static inline int vector_get_vlmax(CPURISCVState *env)
+{
+ return vector_get_lmul(env) * VLEN / vector_get_width(env);
+}
+
+void VECTOR_HELPER(vsetvl)(CPURISCVState *env, uint32_t rs1, uint32_t rs2,
+ uint32_t rd)
+{
+ int sew, max_sew, vlmax, vl;
+
+ if (rs2 == 0) {
+ vector_vtype_set_ill(env);
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ env->vfp.vtype = env->gpr[rs2];
+ sew = 1 << vector_get_width(env) / 8;
+ max_sew = sizeof(target_ulong);
+
+ if (env->misa & RVD) {
+ max_sew = max_sew > 8 ? max_sew : 8;
+ } else if (env->misa & RVF) {
+ max_sew = max_sew > 4 ? max_sew : 4;
+ }
+ if (sew > max_sew) {
+ vector_vtype_set_ill(env);
+ return;
+ }
+
+ vlmax = vector_get_vlmax(env);
+ if (rs1 == 0) {
+ vl = vlmax;
+ } else if (env->gpr[rs1] <= vlmax) {
+ vl = env->gpr[rs1];
+ } else if (env->gpr[rs1] < 2 * vlmax) {
+ vl = ceil(env->gpr[rs1] / 2);
+ } else {
+ vl = vlmax;
+ }
+ env->vfp.vl = vl;
+ env->gpr[rd] = vl;
+ env->vfp.vstart = 0;
+ return;
+}
+
+void VECTOR_HELPER(vsetvli)(CPURISCVState *env, uint32_t rs1, uint32_t zimm,
+ uint32_t rd)
+{
+ int sew, max_sew, vlmax, vl;
+
+ env->vfp.vtype = zimm;
+ sew = vector_get_width(env) / 8;
+ max_sew = sizeof(target_ulong);
+
+ if (env->misa & RVD) {
+ max_sew = max_sew > 8 ? max_sew : 8;
+ } else if (env->misa & RVF) {
+ max_sew = max_sew > 4 ? max_sew : 4;
+ }
+ if (sew > max_sew) {
+ vector_vtype_set_ill(env);
+ return;
+ }
+
+ vlmax = vector_get_vlmax(env);
+ if (rs1 == 0) {
+ vl = vlmax;
+ } else if (env->gpr[rs1] <= vlmax) {
+ vl = env->gpr[rs1];
+ } else if (env->gpr[rs1] < 2 * vlmax) {
+ vl = ceil(env->gpr[rs1] / 2);
+ } else {
+ vl = vlmax;
+ }
+ env->vfp.vl = vl;
+ env->gpr[rd] = vl;
+ env->vfp.vstart = 0;
+ return;
+}
--
2.7.4
* Re: [Qemu-devel] [Qemu-riscv] [PATCH v2 04/17] RISC-V: add vector extension configure instruction
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 04/17] RISC-V: add vector extension configure instruction liuzhiwei
@ 2019-09-11 16:04 ` Chih-Min Chao
2019-09-11 23:09 ` [Qemu-devel] " Richard Henderson
1 sibling, 0 replies; 43+ messages in thread
From: Chih-Min Chao @ 2019-09-11 16:04 UTC (permalink / raw)
To: liuzhiwei
Cc: Palmer Dabbelt, open list:RISC-V, Sagar Karandikar,
Bastian Koppelmann, riku.voipio, laurent, wxy194768,
qemu-devel@nongnu.org Developers, wenmeng_zhang,
Alistair Francis
On Wed, Sep 11, 2019 at 2:38 PM liuzhiwei <zhiwei_liu@c-sky.com> wrote:
> From: LIU Zhiwei <zhiwei_liu@c-sky.com>
>
> Signed-off-by: LIU Zhiwei <zhiwei_liu@c-sky.com>
> ---
> target/riscv/Makefile.objs | 2 +-
> target/riscv/helper.h | 3 +
> target/riscv/insn32.decode | 5 ++
> target/riscv/insn_trans/trans_rvv.inc.c | 46 ++++++++++++
> target/riscv/translate.c | 1 +
> target/riscv/vector_helper.c | 126 ++++++++++++++++++++++++++++++++
> 6 files changed, 182 insertions(+), 1 deletion(-)
> create mode 100644 target/riscv/insn_trans/trans_rvv.inc.c
> create mode 100644 target/riscv/vector_helper.c
>
> diff --git a/target/riscv/Makefile.objs b/target/riscv/Makefile.objs
> index b1c79bc..d577cef 100644
> --- a/target/riscv/Makefile.objs
> +++ b/target/riscv/Makefile.objs
> @@ -1,4 +1,4 @@
> -obj-y += translate.o op_helper.o cpu_helper.o cpu.o csr.o fpu_helper.o gdbstub.o pmp.o
> +obj-y += translate.o op_helper.o cpu_helper.o cpu.o csr.o fpu_helper.o vector_helper.o gdbstub.o pmp.o
>
> DECODETREE = $(SRC_PATH)/scripts/decodetree.py
>
> diff --git a/target/riscv/helper.h b/target/riscv/helper.h
> index debb22a..652f8c3 100644
> --- a/target/riscv/helper.h
> +++ b/target/riscv/helper.h
> @@ -76,3 +76,6 @@ DEF_HELPER_2(mret, tl, env, tl)
> DEF_HELPER_1(wfi, void, env)
> DEF_HELPER_1(tlb_flush, void, env)
> #endif
> +/* Vector functions */
> +DEF_HELPER_4(vector_vsetvli, void, env, i32, i32, i32)
> +DEF_HELPER_4(vector_vsetvl, void, env, i32, i32, i32)
> diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
> index 77f794e..5dc009c 100644
> --- a/target/riscv/insn32.decode
> +++ b/target/riscv/insn32.decode
> @@ -62,6 +62,7 @@
> @r_rm ....... ..... ..... ... ..... ....... %rs2 %rs1 %rm %rd
> @r2_rm ....... ..... ..... ... ..... ....... %rs1 %rm %rd
> @r2 ....... ..... ..... ... ..... ....... %rs1 %rd
> +@r2_zimm . zimm:11 ..... ... ..... ....... %rs1 %rd
>
> @sfence_vma ....... ..... ..... ... ..... ....... %rs2 %rs1
> @sfence_vm ....... ..... ..... ... ..... ....... %rs1
> @@ -203,3 +204,7 @@ fcvt_w_d 1100001 00000 ..... ... ..... 1010011 @r2_rm
> fcvt_wu_d 1100001 00001 ..... ... ..... 1010011 @r2_rm
> fcvt_d_w 1101001 00000 ..... ... ..... 1010011 @r2_rm
> fcvt_d_wu 1101001 00001 ..... ... ..... 1010011 @r2_rm
> +
> +# *** RV32V Extension ***
> +vsetvli 0 ........... ..... 111 ..... 1010111 @r2_zimm
> +vsetvl 1000000 ..... ..... 111 ..... 1010111 @r
> diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
> new file mode 100644
> index 0000000..82e7ad6
> --- /dev/null
> +++ b/target/riscv/insn_trans/trans_rvv.inc.c
> @@ -0,0 +1,46 @@
> +/*
> + * RISC-V translation routines for the RVV Standard Extension.
> + *
> + * Copyright (c) 2019 C-SKY Limited. All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2 or later, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program. If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#define GEN_VECTOR_R(INSN) \
> +static bool trans_##INSN(DisasContext *ctx, arg_##INSN * a) \
> +{ \
> + TCGv_i32 s1 = tcg_const_i32(a->rs1); \
> + TCGv_i32 s2 = tcg_const_i32(a->rs2); \
> + TCGv_i32 d = tcg_const_i32(a->rd); \
> + gen_helper_vector_##INSN(cpu_env, s1, s2, d); \
> + tcg_temp_free_i32(s1); \
> + tcg_temp_free_i32(s2); \
> + tcg_temp_free_i32(d); \
> + return true; \
> +}
> +
> +#define GEN_VECTOR_R2_ZIMM(INSN) \
> +static bool trans_##INSN(DisasContext *ctx, arg_##INSN * a) \
> +{ \
> + TCGv_i32 s1 = tcg_const_i32(a->rs1); \
> + TCGv_i32 zimm = tcg_const_i32(a->zimm); \
> + TCGv_i32 d = tcg_const_i32(a->rd); \
> + gen_helper_vector_##INSN(cpu_env, s1, zimm, d); \
> + tcg_temp_free_i32(s1); \
> + tcg_temp_free_i32(zimm); \
> + tcg_temp_free_i32(d); \
> + return true; \
> +}
> +
> +GEN_VECTOR_R2_ZIMM(vsetvli)
> +GEN_VECTOR_R(vsetvl)
> diff --git a/target/riscv/translate.c b/target/riscv/translate.c
> index 8d6ab73..587c23e 100644
> --- a/target/riscv/translate.c
> +++ b/target/riscv/translate.c
> @@ -706,6 +706,7 @@ static bool gen_shift(DisasContext *ctx, arg_r *a,
> #include "insn_trans/trans_rva.inc.c"
> #include "insn_trans/trans_rvf.inc.c"
> #include "insn_trans/trans_rvd.inc.c"
> +#include "insn_trans/trans_rvv.inc.c"
> #include "insn_trans/trans_privileged.inc.c"
>
> /*
> diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
> new file mode 100644
> index 0000000..b279e6f
> --- /dev/null
> +++ b/target/riscv/vector_helper.c
> @@ -0,0 +1,126 @@
> +/*
> + * RISC-V Vector Extension Helpers for QEMU.
> + *
> + * Copyright (c) 2019 C-SKY Limited. All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2 or later, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program. If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "cpu.h"
> +#include "exec/exec-all.h"
> +#include "exec/helper-proto.h"
> +#include <math.h>
> +
> +#define VECTOR_HELPER(name) HELPER(glue(vector_, name))
> +
> +static inline void vector_vtype_set_ill(CPURISCVState *env)
> +{
> + env->vfp.vtype = ((target_ulong)1) << (sizeof(target_ulong) - 1);
> + return;
> +}
> +
>
The shift count needs to be in bits, not bytes:
env->vfp.vtype = ((target_ulong)1) << (sizeof(target_ulong) * 8 - 1);
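To make the difference concrete, here is a small standalone sketch (not QEMU code). The names xlen_t, vill_bit and vill_bit_as_posted are made up for the example, and xlen_t assumes a 64-bit target. sizeof() counts bytes, so the posted expression sets bit 7 rather than bit XLEN-1.

```c
#include <limits.h>
#include <stdint.h>

/* Stand-in for target_ulong on a 64-bit target (assumption for this sketch). */
typedef uint64_t xlen_t;

/* vill is the most significant bit of vtype, i.e. bit XLEN-1. */
static inline xlen_t vill_bit(void)
{
    return (xlen_t)1 << (sizeof(xlen_t) * CHAR_BIT - 1);
}

/* The expression as posted: shifts by a byte count, which yields bit 7. */
static inline xlen_t vill_bit_as_posted(void)
{
    return (xlen_t)1 << (sizeof(xlen_t) - 1);
}
```

The same byte/bit mix-up would affect any code that tests the bit with a matching shift.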
> +static inline int vector_vtype_get_sew(CPURISCVState *env)
> +{
> + return (env->vfp.vtype >> 2) & 0x7;
> +}
> +
>
This could use extract64() instead of the open-coded shift and mask:
extract64(env->vfp.vtype, 2, 3);
> +static inline int vector_get_width(CPURISCVState *env)
> +{
> + return 8 * (1 << vector_vtype_get_sew(env));
> +}
> +
> +static inline int vector_get_lmul(CPURISCVState *env)
> +{
> + return 1 << (env->vfp.vtype & 0x3);
> +}
> +
>
Likewise for the vlmul field:
extract64(env->vfp.vtype, 0, 2);
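As a rough standalone sketch of how the field decoding fits together: field64() below is a local stand-in that takes the same (value, start, length) arguments as QEMU's extract64(), and the vtype_* names are invented for the example. The field positions (vlmul in bits 1:0, vsew in bits 4:2) are taken from the patch.

```c
#include <stdint.h>

#define VLEN 128 /* same value as the patch's cpu.h */

/*
 * Local stand-in with the same (value, start, length) argument order as
 * QEMU's extract64(); only valid for 0 < length < 64 in this sketch.
 */
static inline uint64_t field64(uint64_t value, int start, int length)
{
    return (value >> start) & ((UINT64_C(1) << length) - 1);
}

/* Element width in bits: 8 << vsew, with vsew in vtype bits 4:2. */
static inline int vtype_sew_bits(uint64_t vtype)
{
    return 8 << field64(vtype, 2, 3);
}

/* Register group multiplier: 1 << vlmul, with vlmul in vtype bits 1:0. */
static inline int vtype_lmul(uint64_t vtype)
{
    return 1 << field64(vtype, 0, 2);
}

/* Maximum number of elements per register group. */
static inline int vtype_vlmax(uint64_t vtype)
{
    return vtype_lmul(vtype) * VLEN / vtype_sew_bits(vtype);
}
```

Taking the vtype value as an argument rather than reading it from env keeps the decoders easy to reuse before the new value is committed.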
> +static inline int vector_get_vlmax(CPURISCVState *env)
> +{
> + return vector_get_lmul(env) * VLEN / vector_get_width(env);
> +}
> +
> +void VECTOR_HELPER(vsetvl)(CPURISCVState *env, uint32_t rs1, uint32_t rs2,
> + uint32_t rd)
> +{
> + int sew, max_sew, vlmax, vl;
> +
> + if (rs2 == 0) {
> + vector_vtype_set_ill(env);
> + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
> + return;
> + }
> + env->vfp.vtype = env->gpr[rs2];
> + sew = 1 << vector_get_width(env) / 8;
> + max_sew = sizeof(target_ulong);
> +
> + if (env->misa & RVD) {
> + max_sew = max_sew > 8 ? max_sew : 8;
> + } else if (env->misa & RVF) {
> + max_sew = max_sew > 4 ? max_sew : 4;
> + }
As far as I understand, max_sew is defined by ELEN, not by which
floating-point extensions are present.
ELEN should be configurable through a command-line CPU parameter.
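Something along these lines is what I have in mind. This is only a sketch: the VectorConfig struct and its elen field are hypothetical, and how the value would actually be wired to a CPU property is left open.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical per-CPU vector configuration. In QEMU this would presumably
 * be filled from a CPU property; the struct and field name are assumptions.
 */
typedef struct {
    unsigned elen; /* maximum element width in bits, e.g. 32 or 64 */
} VectorConfig;

/* SEW in bits from the vsew field (vtype bits 4:2, as in the patch). */
static inline unsigned sew_bits_from_vtype(uint64_t vtype)
{
    return 8u << ((vtype >> 2) & 0x7);
}

/* A vtype is acceptable only if its SEW does not exceed ELEN. */
static inline bool sew_is_supported(const VectorConfig *cfg, uint64_t vtype)
{
    return sew_bits_from_vtype(vtype) <= cfg->elen;
}
```

With this shape the check no longer depends on misa at all, so an integer-only configuration with ELEN=64 would be accepted.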
> + if (sew > max_sew) {
> + vector_vtype_set_ill(env);
> + return;
> + }
> +
> + vlmax = vector_get_vlmax(env);
> + if (rs1 == 0) {
> + vl = vlmax;
> + } else if (env->gpr[rs1] <= vlmax) {
> + vl = env->gpr[rs1];
> + } else if (env->gpr[rs1] < 2 * vlmax) {
> + vl = ceil(env->gpr[rs1] / 2);
> + } else {
> + vl = vlmax;
> + }
> + env->vfp.vl = vl;
> + env->gpr[rd] = vl;
> + env->vfp.vstart = 0;
> + return;
> +}
> +
> +void VECTOR_HELPER(vsetvli)(CPURISCVState *env, uint32_t rs1, uint32_t zimm,
> + uint32_t rd)
> +{
> + int sew, max_sew, vlmax, vl;
> +
> + env->vfp.vtype = zimm;
> + sew = vector_get_width(env) / 8;
> + max_sew = sizeof(target_ulong);
> +
> + if (env->misa & RVD) {
> + max_sew = max_sew > 8 ? max_sew : 8;
> + } else if (env->misa & RVF) {
> + max_sew = max_sew > 4 ? max_sew : 4;
> + }
> + if (sew > max_sew) {
> + vector_vtype_set_ill(env);
> + return;
> + }
> +
The same comment as above applies here.
> + vlmax = vector_get_vlmax(env);
> + if (rs1 == 0) {
> + vl = vlmax;
> + } else if (env->gpr[rs1] <= vlmax) {
> + vl = env->gpr[rs1];
> + } else if (env->gpr[rs1] < 2 * vlmax) {
> + vl = ceil(env->gpr[rs1] / 2);
> + } else {
> + vl = vlmax;
> + }
> + env->vfp.vl = vl;
> + env->gpr[rd] = vl;
> + env->vfp.vstart = 0;
> + return;
> +}
> --
> 2.7.4
>
>
>
* Re: [Qemu-devel] [PATCH v2 04/17] RISC-V: add vector extension configure instruction
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 04/17] RISC-V: add vector extension configure instruction liuzhiwei
2019-09-11 16:04 ` [Qemu-devel] [Qemu-riscv] " Chih-Min Chao
@ 2019-09-11 23:09 ` Richard Henderson
1 sibling, 0 replies; 43+ messages in thread
From: Richard Henderson @ 2019-09-11 23:09 UTC (permalink / raw)
To: liuzhiwei, Alistair.Francis, palmer, sagark, kbastian,
riku.voipio, laurent, wenmeng_zhang
Cc: qemu-riscv, qemu-devel, wxy194768
> +void VECTOR_HELPER(vsetvl)(CPURISCVState *env, uint32_t rs1, uint32_t rs2,
> + uint32_t rd)
> +{
> + int sew, max_sew, vlmax, vl;
> +
> + if (rs2 == 0) {
> + vector_vtype_set_ill(env);
> + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
> + return;
> + }
I don't see that vsetvl, rs2 == r0 should raise SIGILL.
Is that requirement new, after the 0.7.1 specification?
If so, this should happen in the translator and not here.
You should *not* change cpu state (setting vill here) before raising SIGILL.
As far as I can see "vsetvl rd, rs1, r0" == "vsetvli rd, rs1, e8".
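A tiny standalone model of that reading (read_gpr and vsetvl_requested_vtype are names invented for the example): since x0 always reads as zero, rs2 == x0 simply requests vtype == 0, which decodes to SEW=8, LMUL=1 under the field layout used in this patch.

```c
#include <stdint.h>

/* Model of an integer register file in which x0 is hard-wired to zero. */
static inline uint64_t read_gpr(const uint64_t gpr[32], unsigned idx)
{
    return idx == 0 ? 0 : gpr[idx];
}

/* vtype requested by "vsetvl rd, rs1, rs2": simply the value of rs2. */
static inline uint64_t vsetvl_requested_vtype(const uint64_t gpr[32],
                                              unsigned rs2)
{
    return read_gpr(gpr, rs2);
}
```

No special case for rs2 == 0 is needed in the helper under this interpretation.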
> + env->vfp.vtype = env->gpr[rs2];
You should pass the rs2 register by value, not by index.
> + sew = 1 << vector_get_width(env) / 8;
> + max_sew = sizeof(target_ulong);
> +
> + if (env->misa & RVD) {
> + max_sew = max_sew > 8 ? max_sew : 8;
> + } else if (env->misa & RVF) {
> + max_sew = max_sew > 4 ? max_sew : 4;
> + }
> + if (sew > max_sew) {
> + vector_vtype_set_ill(env);
> + return;
> + }
> +
> + vlmax = vector_get_vlmax(env);
> + if (rs1 == 0) {
> + vl = vlmax;
> + } else if (env->gpr[rs1] <= vlmax) {
> + vl = env->gpr[rs1];
> + } else if (env->gpr[rs1] < 2 * vlmax) {
> + vl = ceil(env->gpr[rs1] / 2);
> + } else {
> + vl = vlmax;
> + }
You should pass rs1 register by value, not by index.
The special case of rs1 == r0 can be handled by passing the value
(target_ulong)-1, which will match the final case above.
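A sketch of the by-value form, outside of QEMU (compute_vl and avl_for_rs1 are hypothetical names, and uint64_t stands in for target_ulong). It keeps the three cases from the posted helper. One deliberate difference: the posted ceil(env->gpr[rs1] / 2) divides as integers first, so the ceil() has no effect; the sketch rounds up with integer arithmetic, which is my reading of the intent.

```c
#include <stdint.h>

/*
 * Select vl from the application vector length (AVL) and VLMAX, using the
 * same three cases as the posted helper. The middle case rounds AVL / 2 up.
 */
static inline uint64_t compute_vl(uint64_t avl, uint64_t vlmax)
{
    if (avl <= vlmax) {
        return avl;
    }
    if (avl < 2 * vlmax) {
        return (avl + 1) / 2;
    }
    return vlmax;
}

/* AVL seen by the helper: rs1 == x0 requests VLMAX, encoded as all-ones. */
static inline uint64_t avl_for_rs1(unsigned rs1, uint64_t rs1_value)
{
    return rs1 == 0 ? UINT64_MAX : rs1_value;
}
```

With the all-ones encoding the helper does not need to know the register index at all.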
> + env->vfp.vl = vl;
> + env->gpr[rd] = vl;
> + env->vfp.vstart = 0;
> + return;
> +}
You should return vl and have it assigned to rd by the translator code, and not
assign it here.
> +void VECTOR_HELPER(vsetvli)(CPURISCVState *env, uint32_t rs1, uint32_t zimm,
> + uint32_t rd)
You should not require a separate helper function for this.
Passing the zimm constant as the value for rs2 above is the correct mapping
between the two instructions.
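Roughly, as a standalone model rather than actual TCG code: VecState, vsetvl_common, model_vsetvl and model_vsetvli are hypothetical names, uint64_t stands in for target_ulong, and the SEW legality check is left out to keep it short. Both instructions funnel into one function that takes values and returns vl for the caller to store in rd.

```c
#include <stdint.h>

#define VLEN 128 /* same value as the patch's cpu.h */

/* Hypothetical stand-in for the vector fields of env->vfp. */
typedef struct {
    uint64_t vtype;
    uint64_t vl;
    uint64_t vstart;
} VecState;

/*
 * One helper for both instructions: it receives the AVL and the requested
 * vtype as values, updates the vector state, and returns the new vl so the
 * caller can write it to rd. The SEW/ELEN legality check is omitted here.
 */
static uint64_t vsetvl_common(VecState *st, uint64_t avl, uint64_t vtype)
{
    uint64_t sew = UINT64_C(8) << ((vtype >> 2) & 0x7);
    uint64_t lmul = UINT64_C(1) << (vtype & 0x3);
    uint64_t vlmax = lmul * VLEN / sew;
    uint64_t vl;

    if (avl <= vlmax) {
        vl = avl;
    } else if (avl < 2 * vlmax) {
        vl = (avl + 1) / 2;
    } else {
        vl = vlmax;
    }

    st->vtype = vtype;
    st->vl = vl;
    st->vstart = 0;
    return vl;
}

/* vsetvl: the caller passes the value read from rs2. */
static uint64_t model_vsetvl(VecState *st, uint64_t avl, uint64_t rs2_value)
{
    return vsetvl_common(st, avl, rs2_value);
}

/* vsetvli: the caller passes the decoded immediate instead. */
static uint64_t model_vsetvli(VecState *st, uint64_t avl, uint32_t zimm)
{
    return vsetvl_common(st, avl, zimm);
}
```

In the real translator the two wrappers would disappear: each trans_ function would just supply a different second operand to the same helper call.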
r~
* [Qemu-devel] [PATCH v2 05/17] RISC-V: add vector extension load and store instructions
2019-09-11 6:25 [Qemu-devel] [PATCH v2 00/17] RISC-V: support vector extension liuzhiwei
` (3 preceding siblings ...)
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 04/17] RISC-V: add vector extension configure instruction liuzhiwei
@ 2019-09-11 6:25 ` liuzhiwei
2019-09-12 14:23 ` Richard Henderson
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 06/17] RISC-V: add vector extension fault-only-first implementation liuzhiwei
` (12 subsequent siblings)
17 siblings, 1 reply; 43+ messages in thread
From: liuzhiwei @ 2019-09-11 6:25 UTC (permalink / raw)
To: Alistair.Francis, palmer, sagark, kbastian, riku.voipio, laurent,
wenmeng_zhang
Cc: qemu-riscv, qemu-devel, wxy194768, LIU Zhiwei
From: LIU Zhiwei <zhiwei_liu@c-sky.com>
Signed-off-by: LIU Zhiwei <zhiwei_liu@c-sky.com>
---
target/riscv/helper.h | 37 +
target/riscv/insn32.decode | 46 +
target/riscv/insn_trans/trans_rvv.inc.c | 70 +
target/riscv/vector_helper.c | 2638 +++++++++++++++++++++++++++++++
4 files changed, 2791 insertions(+)
diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 652f8c3..f77c392 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -77,5 +77,42 @@ DEF_HELPER_1(wfi, void, env)
DEF_HELPER_1(tlb_flush, void, env)
#endif
/* Vector functions */
+DEF_HELPER_5(vector_vlb_v, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vlh_v, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vlw_v, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vle_v, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vlbu_v, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vlhu_v, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vlwu_v, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vsb_v, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vsh_v, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vsw_v, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vse_v, void, env, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vlsb_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vlsh_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vlsw_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vlse_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vlsbu_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vlshu_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vlswu_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vssb_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vssh_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vssw_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vsse_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vlxb_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vlxh_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vlxw_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vlxe_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vlxbu_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vlxhu_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vlxwu_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vsxb_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vsxh_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vsxw_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vsxe_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vsuxb_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vsuxh_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vsuxw_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vsuxe_v, void, env, i32, i32, i32, i32, i32)
DEF_HELPER_4(vector_vsetvli, void, env, i32, i32, i32)
DEF_HELPER_4(vector_vsetvl, void, env, i32, i32, i32)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 5dc009c..b8a3d8a 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -25,6 +25,7 @@
%sh10 20:10
%csr 20:12
%rm 12:3
+%nf 29:3
# immediates:
%imm_i 20:s12
@@ -62,6 +63,8 @@
@r_rm ....... ..... ..... ... ..... ....... %rs2 %rs1 %rm %rd
@r2_rm ....... ..... ..... ... ..... ....... %rs1 %rm %rd
@r2 ....... ..... ..... ... ..... ....... %rs1 %rd
+@r_nfvm nf:3 ... vm:1 ..... ..... ... ..... ....... %rs2 %rs1 %rd
+@r2_nfvm nf:3 ... vm:1 ..... ..... ... ..... ....... %rs1 %rd
@r2_zimm . zimm:11 ..... ... ..... ....... %rs1 %rd
@sfence_vma ....... ..... ..... ... ..... ....... %rs2 %rs1
@@ -206,5 +209,48 @@ fcvt_d_w 1101001 00000 ..... ... ..... 1010011 @r2_rm
fcvt_d_wu 1101001 00001 ..... ... ..... 1010011 @r2_rm
# *** RV32V Extension ***
+
+# *** Vector loads and stores are encoded within LOADFP/STORE-FP ***
+vlb_v ... 100 . 00000 ..... 000 ..... 0000111 @r2_nfvm
+vlh_v ... 100 . 00000 ..... 101 ..... 0000111 @r2_nfvm
+vlw_v ... 100 . 00000 ..... 110 ..... 0000111 @r2_nfvm
+vle_v ... 000 . 00000 ..... 111 ..... 0000111 @r2_nfvm
+vlbu_v ... 000 . 00000 ..... 000 ..... 0000111 @r2_nfvm
+vlhu_v ... 000 . 00000 ..... 101 ..... 0000111 @r2_nfvm
+vlwu_v ... 000 . 00000 ..... 110 ..... 0000111 @r2_nfvm
+vsb_v ... 000 . 00000 ..... 000 ..... 0100111 @r2_nfvm
+vsh_v ... 000 . 00000 ..... 101 ..... 0100111 @r2_nfvm
+vsw_v ... 000 . 00000 ..... 110 ..... 0100111 @r2_nfvm
+vse_v ... 000 . 00000 ..... 111 ..... 0100111 @r2_nfvm
+
+vlsb_v ... 110 . ..... ..... 000 ..... 0000111 @r_nfvm
+vlsh_v ... 110 . ..... ..... 101 ..... 0000111 @r_nfvm
+vlsw_v ... 110 . ..... ..... 110 ..... 0000111 @r_nfvm
+vlse_v ... 010 . ..... ..... 111 ..... 0000111 @r_nfvm
+vlsbu_v ... 010 . ..... ..... 000 ..... 0000111 @r_nfvm
+vlshu_v ... 010 . ..... ..... 101 ..... 0000111 @r_nfvm
+vlswu_v ... 010 . ..... ..... 110 ..... 0000111 @r_nfvm
+vssb_v ... 010 . ..... ..... 000 ..... 0100111 @r_nfvm
+vssh_v ... 010 . ..... ..... 101 ..... 0100111 @r_nfvm
+vssw_v ... 010 . ..... ..... 110 ..... 0100111 @r_nfvm
+vsse_v ... 010 . ..... ..... 111 ..... 0100111 @r_nfvm
+
+vlxb_v ... 111 . ..... ..... 000 ..... 0000111 @r_nfvm
+vlxh_v ... 111 . ..... ..... 101 ..... 0000111 @r_nfvm
+vlxw_v ... 111 . ..... ..... 110 ..... 0000111 @r_nfvm
+vlxe_v ... 011 . ..... ..... 111 ..... 0000111 @r_nfvm
+vlxbu_v ... 011 . ..... ..... 000 ..... 0000111 @r_nfvm
+vlxhu_v ... 011 . ..... ..... 101 ..... 0000111 @r_nfvm
+vlxwu_v ... 011 . ..... ..... 110 ..... 0000111 @r_nfvm
+vsxb_v ... 011 . ..... ..... 000 ..... 0100111 @r_nfvm
+vsxh_v ... 011 . ..... ..... 101 ..... 0100111 @r_nfvm
+vsxw_v ... 011 . ..... ..... 110 ..... 0100111 @r_nfvm
+vsxe_v ... 011 . ..... ..... 111 ..... 0100111 @r_nfvm
+vsuxb_v ... 111 . ..... ..... 000 ..... 0100111 @r_nfvm
+vsuxh_v ... 111 . ..... ..... 101 ..... 0100111 @r_nfvm
+vsuxw_v ... 111 . ..... ..... 110 ..... 0100111 @r_nfvm
+vsuxe_v ... 111 . ..... ..... 111 ..... 0100111 @r_nfvm
+
+#*** new major opcode OP-V ***
vsetvli 0 ........... ..... 111 ..... 1010111 @r2_zimm
vsetvl 1000000 ..... ..... 111 ..... 1010111 @r
diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index 82e7ad6..16b1f90 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -16,6 +16,37 @@
* this program. If not, see <http://www.gnu.org/licenses/>.
*/
+#define GEN_VECTOR_R2_NFVM(INSN) \
+static bool trans_##INSN(DisasContext *ctx, arg_##INSN * a) \
+{ \
+ TCGv_i32 s1 = tcg_const_i32(a->rs1); \
+ TCGv_i32 d = tcg_const_i32(a->rd); \
+ TCGv_i32 nf = tcg_const_i32(a->nf); \
+ TCGv_i32 vm = tcg_const_i32(a->vm); \
+ gen_helper_vector_##INSN(cpu_env, nf, vm, s1, d); \
+ tcg_temp_free_i32(s1); \
+ tcg_temp_free_i32(d); \
+ tcg_temp_free_i32(nf); \
+ tcg_temp_free_i32(vm); \
+ return true; \
+}
+#define GEN_VECTOR_R_NFVM(INSN) \
+static bool trans_##INSN(DisasContext *ctx, arg_##INSN * a) \
+{ \
+ TCGv_i32 s1 = tcg_const_i32(a->rs1); \
+ TCGv_i32 s2 = tcg_const_i32(a->rs2); \
+ TCGv_i32 d = tcg_const_i32(a->rd); \
+ TCGv_i32 nf = tcg_const_i32(a->nf); \
+ TCGv_i32 vm = tcg_const_i32(a->vm); \
+ gen_helper_vector_##INSN(cpu_env, nf, vm, s1, s2, d);\
+ tcg_temp_free_i32(s1); \
+ tcg_temp_free_i32(s2); \
+ tcg_temp_free_i32(d); \
+ tcg_temp_free_i32(nf); \
+ tcg_temp_free_i32(vm); \
+ return true; \
+}
+
#define GEN_VECTOR_R(INSN) \
static bool trans_##INSN(DisasContext *ctx, arg_##INSN * a) \
{ \
@@ -42,5 +73,44 @@ static bool trans_##INSN(DisasContext *ctx, arg_##INSN * a) \
return true; \
}
+GEN_VECTOR_R2_NFVM(vlb_v)
+GEN_VECTOR_R2_NFVM(vlh_v)
+GEN_VECTOR_R2_NFVM(vlw_v)
+GEN_VECTOR_R2_NFVM(vle_v)
+GEN_VECTOR_R2_NFVM(vlbu_v)
+GEN_VECTOR_R2_NFVM(vlhu_v)
+GEN_VECTOR_R2_NFVM(vlwu_v)
+GEN_VECTOR_R2_NFVM(vsb_v)
+GEN_VECTOR_R2_NFVM(vsh_v)
+GEN_VECTOR_R2_NFVM(vsw_v)
+GEN_VECTOR_R2_NFVM(vse_v)
+
+GEN_VECTOR_R_NFVM(vlsb_v)
+GEN_VECTOR_R_NFVM(vlsh_v)
+GEN_VECTOR_R_NFVM(vlsw_v)
+GEN_VECTOR_R_NFVM(vlse_v)
+GEN_VECTOR_R_NFVM(vlsbu_v)
+GEN_VECTOR_R_NFVM(vlshu_v)
+GEN_VECTOR_R_NFVM(vlswu_v)
+GEN_VECTOR_R_NFVM(vssb_v)
+GEN_VECTOR_R_NFVM(vssh_v)
+GEN_VECTOR_R_NFVM(vssw_v)
+GEN_VECTOR_R_NFVM(vsse_v)
+GEN_VECTOR_R_NFVM(vlxb_v)
+GEN_VECTOR_R_NFVM(vlxh_v)
+GEN_VECTOR_R_NFVM(vlxw_v)
+GEN_VECTOR_R_NFVM(vlxe_v)
+GEN_VECTOR_R_NFVM(vlxbu_v)
+GEN_VECTOR_R_NFVM(vlxhu_v)
+GEN_VECTOR_R_NFVM(vlxwu_v)
+GEN_VECTOR_R_NFVM(vsxb_v)
+GEN_VECTOR_R_NFVM(vsxh_v)
+GEN_VECTOR_R_NFVM(vsxw_v)
+GEN_VECTOR_R_NFVM(vsxe_v)
+GEN_VECTOR_R_NFVM(vsuxb_v)
+GEN_VECTOR_R_NFVM(vsuxh_v)
+GEN_VECTOR_R_NFVM(vsuxw_v)
+GEN_VECTOR_R_NFVM(vsuxe_v)
+
GEN_VECTOR_R2_ZIMM(vsetvli)
GEN_VECTOR_R(vsetvl)
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index b279e6f..62e4d2e 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -20,10 +20,60 @@
#include "cpu.h"
#include "exec/exec-all.h"
#include "exec/helper-proto.h"
+#include "exec/cpu_ldst.h"
#include <math.h>
#define VECTOR_HELPER(name) HELPER(glue(vector_, name))
+static int64_t sign_extend(int64_t a, int8_t width)
+{
+ return a << (64 - width) >> (64 - width);
+}
+
+static target_ulong vector_get_index(CPURISCVState *env, int rs1, int rs2,
+ int index, int mem, int width, int nf)
+{
+ target_ulong abs_off, base = env->gpr[rs1];
+ target_long offset;
+ switch (width) {
+ case 8:
+ offset = sign_extend(env->vfp.vreg[rs2].s8[index], 8) + nf * mem;
+ break;
+ case 16:
+ offset = sign_extend(env->vfp.vreg[rs2].s16[index], 16) + nf * mem;
+ break;
+ case 32:
+ offset = sign_extend(env->vfp.vreg[rs2].s32[index], 32) + nf * mem;
+ break;
+ case 64:
+ offset = env->vfp.vreg[rs2].s64[index] + nf * mem;
+ break;
+ default:
+ helper_raise_exception(env, RISCV_EXCP_ILLEGAL_INST);
+ return 0;
+ }
+ if (offset < 0) {
+ abs_off = ~offset + 1;
+ if (base >= abs_off) {
+ return base - abs_off;
+ }
+ } else {
+ if ((target_ulong)((target_ulong)offset + base) >= base) {
+ return (target_ulong)offset + base;
+ }
+ }
+ helper_raise_exception(env, RISCV_EXCP_ILLEGAL_INST);
+ return 0;
+}
+
+static inline bool vector_vtype_ill(CPURISCVState *env)
+{
+ if ((env->vfp.vtype >> (sizeof(target_ulong) - 1)) & 0x1) {
+ return true;
+ }
+ return false;
+}
+
static inline void vector_vtype_set_ill(CPURISCVState *env)
{
env->vfp.vtype = ((target_ulong)1) << (sizeof(target_ulong) - 1);
@@ -50,6 +100,76 @@ static inline int vector_get_vlmax(CPURISCVState *env)
return vector_get_lmul(env) * VLEN / vector_get_width(env);
}
+static inline int vector_elem_mask(CPURISCVState *env, uint32_t vm, int width,
+ int lmul, int index)
+{
+ int mlen = width / lmul;
+ int idx = (index * mlen) / 8;
+ int pos = (index * mlen) % 8;
+
+ return vm || ((env->vfp.vreg[0].u8[idx] >> pos) & 0x1);
+}
+
+static inline bool vector_overlap_vm_common(int lmul, int vm, int rd)
+{
+ if (lmul > 1 && vm == 0 && rd == 0) {
+ return true;
+ }
+ return false;
+}
+
+static bool vector_lmul_check_reg(CPURISCVState *env, uint32_t lmul,
+ uint32_t reg, bool widen)
+{
+ int legal = widen ? (lmul * 2) : lmul;
+
+ if ((lmul != 1 && lmul != 2 && lmul != 4 && lmul != 8) ||
+ (lmul == 8 && widen)) {
+ helper_raise_exception(env, RISCV_EXCP_ILLEGAL_INST);
+ return false;
+ }
+
+ if (reg % legal != 0) {
+ helper_raise_exception(env, RISCV_EXCP_ILLEGAL_INST);
+ return false;
+ }
+ return true;
+}
+
+static void vector_tail_segment(CPURISCVState *env, int vreg, int index,
+ int width, int nf, int lmul)
+{
+ switch (width) {
+ case 8:
+ while (nf >= 0) {
+ env->vfp.vreg[vreg + nf * lmul].u8[index] = 0;
+ nf--;
+ }
+ break;
+ case 16:
+ while (nf >= 0) {
+ env->vfp.vreg[vreg + nf * lmul].u16[index] = 0;
+ nf--;
+ }
+ break;
+ case 32:
+ while (nf >= 0) {
+ env->vfp.vreg[vreg + nf * lmul].u32[index] = 0;
+ nf--;
+ }
+ break;
+ case 64:
+ while (nf >= 0) {
+ env->vfp.vreg[vreg + nf * lmul].u64[index] = 0;
+ nf--;
+ }
+ break;
+ default:
+ helper_raise_exception(env, RISCV_EXCP_ILLEGAL_INST);
+ return;
+ }
+}
+
void VECTOR_HELPER(vsetvl)(CPURISCVState *env, uint32_t rs1, uint32_t rs2,
uint32_t rd)
{
@@ -124,3 +244,2521 @@ void VECTOR_HELPER(vsetvli)(CPURISCVState *env, uint32_t rs1, uint32_t zimm,
env->vfp.vstart = 0;
return;
}
+
+void VECTOR_HELPER(vlbu_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rd)
+{
+ int i, j, k, vl, vlmax, lmul, width, dest, read;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (lmul * (nf + 1) > 32) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = nf;
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = i * (nf + 1) + k;
+ env->vfp.vreg[dest + k * lmul].u8[j] =
+ cpu_ldub_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = i * (nf + 1) + k;
+ env->vfp.vreg[dest + k * lmul].u16[j] =
+ cpu_ldub_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = i * (nf + 1) + k;
+ env->vfp.vreg[dest + k * lmul].u32[j] =
+ cpu_ldub_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = i * (nf + 1) + k;
+ env->vfp.vreg[dest + k * lmul].u64[j] =
+ cpu_ldub_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_segment(env, dest, j, width, k, lmul);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vlb_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rd)
+{
+ int i, j, k, vl, vlmax, lmul, width, dest, read;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (lmul * (nf + 1) > 32) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = nf;
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = i * (nf + 1) + k;
+ env->vfp.vreg[dest + k * lmul].s8[j] =
+ cpu_ldsb_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = i * (nf + 1) + k;
+ env->vfp.vreg[dest + k * lmul].s16[j] = sign_extend(
+ cpu_ldsb_data(env, env->gpr[rs1] + read), 8);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = i * (nf + 1) + k;
+ env->vfp.vreg[dest + k * lmul].s32[j] = sign_extend(
+ cpu_ldsb_data(env, env->gpr[rs1] + read), 8);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = i * (nf + 1) + k;
+ env->vfp.vreg[dest + k * lmul].s64[j] = sign_extend(
+ cpu_ldsb_data(env, env->gpr[rs1] + read), 8);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_segment(env, dest, j, width, k, lmul);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vlsbu_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl, vlmax, lmul, width, dest, read;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (lmul * (nf + 1) > 32) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = nf;
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = i * env->gpr[rs2] + k;
+ env->vfp.vreg[dest + k * lmul].u8[j] =
+ cpu_ldub_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = i * env->gpr[rs2] + k;
+ env->vfp.vreg[dest + k * lmul].u16[j] =
+ cpu_ldub_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = i * env->gpr[rs2] + k;
+ env->vfp.vreg[dest + k * lmul].u32[j] =
+ cpu_ldub_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = i * env->gpr[rs2] + k;
+ env->vfp.vreg[dest + k * lmul].u64[j] =
+ cpu_ldub_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_segment(env, dest, j, width, k, lmul);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vlsb_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl, vlmax, lmul, width, dest, read;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (lmul * (nf + 1) > 32) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = nf;
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = i * env->gpr[rs2] + k;
+ env->vfp.vreg[dest + k * lmul].s8[j] =
+ cpu_ldsb_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = i * env->gpr[rs2] + k;
+ env->vfp.vreg[dest + k * lmul].s16[j] = sign_extend(
+ cpu_ldsb_data(env, env->gpr[rs1] + read), 8);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = i * env->gpr[rs2] + k;
+ env->vfp.vreg[dest + k * lmul].s32[j] = sign_extend(
+ cpu_ldsb_data(env, env->gpr[rs1] + read), 8);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = i * env->gpr[rs2] + k;
+ env->vfp.vreg[dest + k * lmul].s64[j] = sign_extend(
+ cpu_ldsb_data(env, env->gpr[rs1] + read), 8);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_segment(env, dest, j, width, k, lmul);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vlxbu_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl, vlmax, lmul, width, dest, src2;
+ target_ulong addr;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (lmul * (nf + 1) > 32) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = nf;
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ addr = vector_get_index(env, rs1, src2, j, 1, width, k);
+ env->vfp.vreg[dest + k * lmul].u8[j] =
+ cpu_ldub_data(env, addr);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ addr = vector_get_index(env, rs1, src2, j, 1, width, k);
+ env->vfp.vreg[dest + k * lmul].u16[j] =
+ cpu_ldub_data(env, addr);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ addr = vector_get_index(env, rs1, src2, j, 1, width, k);
+ env->vfp.vreg[dest + k * lmul].u32[j] =
+ cpu_ldub_data(env, addr);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ addr = vector_get_index(env, rs1, src2, j, 1, width, k);
+ env->vfp.vreg[dest + k * lmul].u64[j] =
+ cpu_ldub_data(env, addr);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_segment(env, dest, j, width, k, lmul);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vlxb_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl, vlmax, lmul, width, dest, src2;
+ target_ulong addr;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (lmul * (nf + 1) > 32) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = nf;
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ addr = vector_get_index(env, rs1, src2, j, 1, width, k);
+ env->vfp.vreg[dest + k * lmul].s8[j] =
+ cpu_ldsb_data(env, addr);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ addr = vector_get_index(env, rs1, src2, j, 1, width, k);
+ env->vfp.vreg[dest + k * lmul].s16[j] = sign_extend(
+ cpu_ldsb_data(env, addr), 8);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ addr = vector_get_index(env, rs1, src2, j, 1, width, k);
+ env->vfp.vreg[dest + k * lmul].s32[j] = sign_extend(
+ cpu_ldsb_data(env, addr), 8);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ addr = vector_get_index(env, rs1, src2, j, 1, width, k);
+ env->vfp.vreg[dest + k * lmul].s64[j] = sign_extend(
+ cpu_ldsb_data(env, addr), 8);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_segment(env, dest, j, width, k, lmul);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vlhu_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rd)
+{
+ int i, j, k, vl, vlmax, lmul, width, dest, read;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (lmul * (nf + 1) > 32) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = nf;
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = (i * (nf + 1) + k) * 2;
+ env->vfp.vreg[dest + k * lmul].u16[j] =
+ cpu_lduw_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = (i * (nf + 1) + k) * 2;
+ env->vfp.vreg[dest + k * lmul].u32[j] =
+ cpu_lduw_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = (i * (nf + 1) + k) * 2;
+ env->vfp.vreg[dest + k * lmul].u64[j] =
+ cpu_lduw_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_segment(env, dest, j, width, k, lmul);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vlh_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rd)
+{
+ int i, j, k, vl, vlmax, lmul, width, dest, read;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (lmul * (nf + 1) > 32) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = nf;
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = (i * (nf + 1) + k) * 2;
+ env->vfp.vreg[dest + k * lmul].s16[j] =
+ cpu_ldsw_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = (i * (nf + 1) + k) * 2;
+ env->vfp.vreg[dest + k * lmul].s32[j] = sign_extend(
+ cpu_ldsw_data(env, env->gpr[rs1] + read), 16);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = (i * (nf + 1) + k) * 2;
+ env->vfp.vreg[dest + k * lmul].s64[j] = sign_extend(
+ cpu_ldsw_data(env, env->gpr[rs1] + read), 16);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_segment(env, dest, j, width, k, lmul);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vlshu_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl, vlmax, lmul, width, dest, read;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (lmul * (nf + 1) > 32) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = nf;
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = i * env->gpr[rs2] + k * 2;
+ env->vfp.vreg[dest + k * lmul].u16[j] =
+ cpu_lduw_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = i * env->gpr[rs2] + k * 2;
+ env->vfp.vreg[dest + k * lmul].u32[j] =
+ cpu_lduw_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = i * env->gpr[rs2] + k * 2;
+ env->vfp.vreg[dest + k * lmul].u64[j] =
+ cpu_lduw_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_segment(env, dest, j, width, k, lmul);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vlsh_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl, vlmax, lmul, width, dest, read;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (lmul * (nf + 1) > 32) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = nf;
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = i * env->gpr[rs2] + k * 2;
+ env->vfp.vreg[dest + k * lmul].s16[j] =
+ cpu_ldsw_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = i * env->gpr[rs2] + k * 2;
+ env->vfp.vreg[dest + k * lmul].s32[j] = sign_extend(
+ cpu_ldsw_data(env, env->gpr[rs1] + read), 16);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = i * env->gpr[rs2] + k * 2;
+ env->vfp.vreg[dest + k * lmul].s64[j] = sign_extend(
+ cpu_ldsw_data(env, env->gpr[rs1] + read), 16);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_segment(env, dest, j, width, k, lmul);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vlxhu_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl, vlmax, lmul, width, dest, src2;
+ target_ulong addr;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (lmul * (nf + 1) > 32) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = nf;
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ addr = vector_get_index(env, rs1, src2, j, 2, width, k);
+ env->vfp.vreg[dest + k * lmul].u16[j] =
+ cpu_lduw_data(env, addr);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ addr = vector_get_index(env, rs1, src2, j, 2, width, k);
+ env->vfp.vreg[dest + k * lmul].u32[j] =
+ cpu_lduw_data(env, addr);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ addr = vector_get_index(env, rs1, src2, j, 2, width, k);
+ env->vfp.vreg[dest + k * lmul].u64[j] =
+ cpu_lduw_data(env, addr);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_segment(env, dest, j, width, k, lmul);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vlxh_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl, vlmax, lmul, width, dest, src2;
+ target_ulong addr;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (lmul * (nf + 1) > 32) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = nf;
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ addr = vector_get_index(env, rs1, src2, j, 2, width, k);
+ env->vfp.vreg[dest + k * lmul].s16[j] =
+ cpu_ldsw_data(env, addr);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ addr = vector_get_index(env, rs1, src2, j, 2, width, k);
+ env->vfp.vreg[dest + k * lmul].s32[j] = sign_extend(
+ cpu_ldsw_data(env, addr), 16);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ addr = vector_get_index(env, rs1, src2, j, 2, width, k);
+ env->vfp.vreg[dest + k * lmul].s64[j] = sign_extend(
+ cpu_ldsw_data(env, addr), 16);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_segment(env, dest, j, width, k, lmul);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vlw_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rd)
+{
+ int i, j, k, vl, vlmax, lmul, width, dest, read;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (lmul * (nf + 1) > 32) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = nf;
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = (i * (nf + 1) + k) * 4;
+ env->vfp.vreg[dest + k * lmul].s32[j] =
+ cpu_ldl_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = (i * (nf + 1) + k) * 4;
+ env->vfp.vreg[dest + k * lmul].s64[j] = sign_extend(
+ cpu_ldl_data(env, env->gpr[rs1] + read), 32);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_segment(env, dest, j, width, k, lmul);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vlwu_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rd)
+{
+ int i, j, k, vl, vlmax, lmul, width, dest, read;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (lmul * (nf + 1) > 32) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = nf;
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = (i * (nf + 1) + k) * 4;
+ env->vfp.vreg[dest + k * lmul].u32[j] =
+ cpu_ldl_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = (i * (nf + 1) + k) * 4;
+ env->vfp.vreg[dest + k * lmul].u64[j] =
+ cpu_ldl_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_segment(env, dest, j, width, k, lmul);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vlswu_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl, vlmax, lmul, width, dest, read;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (lmul * (nf + 1) > 32) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = nf;
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = i * env->gpr[rs2] + k * 4;
+ env->vfp.vreg[dest + k * lmul].u32[j] =
+ cpu_ldl_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = i * env->gpr[rs2] + k * 4;
+ env->vfp.vreg[dest + k * lmul].u64[j] =
+ cpu_ldl_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_segment(env, dest, j, width, k, lmul);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vlsw_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl, vlmax, lmul, width, dest, read;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (lmul * (nf + 1) > 32) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = nf;
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = i * env->gpr[rs2] + k * 4;
+ env->vfp.vreg[dest + k * lmul].s32[j] =
+ cpu_ldl_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = i * env->gpr[rs2] + k * 4;
+ env->vfp.vreg[dest + k * lmul].s64[j] = sign_extend(
+ cpu_ldl_data(env, env->gpr[rs1] + read), 32);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_segment(env, dest, j, width, k, lmul);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vlxwu_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl, vlmax, lmul, width, dest, src2;
+ target_ulong addr;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (lmul * (nf + 1) > 32) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = nf;
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ addr = vector_get_index(env, rs1, src2, j, 4, width, k);
+ env->vfp.vreg[dest + k * lmul].u32[j] =
+ cpu_ldl_data(env, addr);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ addr = vector_get_index(env, rs1, src2, j, 4, width, k);
+ env->vfp.vreg[dest + k * lmul].u64[j] =
+ cpu_ldl_data(env, addr);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_segment(env, dest, j, width, k, lmul);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vlxw_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl, vlmax, lmul, width, dest, src2;
+ target_ulong addr;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (lmul * (nf + 1) > 32) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = nf;
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ addr = vector_get_index(env, rs1, src2, j, 4, width, k);
+ env->vfp.vreg[dest + k * lmul].s32[j] =
+ cpu_ldl_data(env, addr);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ addr = vector_get_index(env, rs1, src2, j, 4, width, k);
+ env->vfp.vreg[dest + k * lmul].s64[j] = sign_extend(
+ cpu_ldl_data(env, addr), 32);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_segment(env, dest, j, width, k, lmul);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vle_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rd)
+{
+ int i, j, k, vl, vlmax, lmul, width, dest, read;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (lmul * (nf + 1) > 32) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = nf;
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = i * (nf + 1) + k;
+ env->vfp.vreg[dest + k * lmul].u8[j] =
+ cpu_ldub_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = (i * (nf + 1) + k) * 2;
+ env->vfp.vreg[dest + k * lmul].u16[j] =
+ cpu_lduw_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = (i * (nf + 1) + k) * 4;
+ env->vfp.vreg[dest + k * lmul].u32[j] =
+ cpu_ldl_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = (i * (nf + 1) + k) * 8;
+ env->vfp.vreg[dest + k * lmul].u64[j] =
+ cpu_ldq_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_segment(env, dest, j, width, k, lmul);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vlse_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl, vlmax, lmul, width, dest, read;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (lmul * (nf + 1) > 32) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = nf;
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = i * env->gpr[rs2] + k;
+ env->vfp.vreg[dest + k * lmul].u8[j] =
+ cpu_ldub_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = i * env->gpr[rs2] + k * 2;
+ env->vfp.vreg[dest + k * lmul].u16[j] =
+ cpu_lduw_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = i * env->gpr[rs2] + k * 4;
+ env->vfp.vreg[dest + k * lmul].u32[j] =
+ cpu_ldl_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = i * env->gpr[rs2] + k * 8;
+ env->vfp.vreg[dest + k * lmul].u64[j] =
+ cpu_ldq_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_segment(env, dest, j, width, k, lmul);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vlxe_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl, vlmax, lmul, width, dest, src2;
+ target_ulong addr;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (lmul * (nf + 1) > 32) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = nf;
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ addr = vector_get_index(env, rs1, src2, j, 1, width, k);
+ env->vfp.vreg[dest + k * lmul].u8[j] =
+ cpu_ldub_data(env, addr);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ addr = vector_get_index(env, rs1, src2, j, 2, width, k);
+ env->vfp.vreg[dest + k * lmul].u16[j] =
+ cpu_lduw_data(env, addr);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ addr = vector_get_index(env, rs1, src2, j, 4, width, k);
+ env->vfp.vreg[dest + k * lmul].u32[j] =
+ cpu_ldl_data(env, addr);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ addr = vector_get_index(env, rs1, src2, j, 8, width, k);
+ env->vfp.vreg[dest + k * lmul].u64[j] =
+ cpu_ldq_data(env, addr);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_segment(env, dest, j, width, k, lmul);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vsb_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rd)
+{
+ int i, j, k, vl, vlmax, lmul, width, dest, wrote;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (lmul * (nf + 1) > 32) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = nf;
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ wrote = i * (nf + 1) + k;
+ cpu_stb_data(env, env->gpr[rs1] + wrote,
+ env->vfp.vreg[dest + k * lmul].s8[j]);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ wrote = i * (nf + 1) + k;
+ cpu_stb_data(env, env->gpr[rs1] + wrote,
+ env->vfp.vreg[dest + k * lmul].s16[j]);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ wrote = i * (nf + 1) + k;
+ cpu_stb_data(env, env->gpr[rs1] + wrote,
+ env->vfp.vreg[dest + k * lmul].s32[j]);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ wrote = i * (nf + 1) + k;
+ cpu_stb_data(env, env->gpr[rs1] + wrote,
+ env->vfp.vreg[dest + k * lmul].s64[j]);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vssb_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl, vlmax, lmul, width, dest, wrote;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (lmul * (nf + 1) > 32) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = nf;
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ wrote = i * env->gpr[rs2] + k;
+ cpu_stb_data(env, env->gpr[rs1] + wrote,
+ env->vfp.vreg[dest + k * lmul].s8[j]);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ wrote = i * env->gpr[rs2] + k;
+ cpu_stb_data(env, env->gpr[rs1] + wrote,
+ env->vfp.vreg[dest + k * lmul].s16[j]);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ wrote = i * env->gpr[rs2] + k;
+ cpu_stb_data(env, env->gpr[rs1] + wrote,
+ env->vfp.vreg[dest + k * lmul].s32[j]);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ wrote = i * env->gpr[rs2] + k;
+ cpu_stb_data(env, env->gpr[rs1] + wrote,
+ env->vfp.vreg[dest + k * lmul].s64[j]);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vsxb_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl, vlmax, lmul, width, dest, src2;
+ target_ulong addr;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (lmul * (nf + 1) > 32) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = nf;
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ addr = vector_get_index(env, rs1, src2, j, 1, width, k);
+ cpu_stb_data(env, addr,
+ env->vfp.vreg[dest + k * lmul].s8[j]);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ addr = vector_get_index(env, rs1, src2, j, 1, width, k);
+ cpu_stb_data(env, addr,
+ env->vfp.vreg[dest + k * lmul].s16[j]);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ addr = vector_get_index(env, rs1, src2, j, 1, width, k);
+ cpu_stb_data(env, addr,
+ env->vfp.vreg[dest + k * lmul].s32[j]);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ addr = vector_get_index(env, rs1, src2, j, 1, width, k);
+ cpu_stb_data(env, addr,
+ env->vfp.vreg[dest + k * lmul].s64[j]);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vsuxb_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rs2, uint32_t rd)
+{
+ VECTOR_HELPER(vsxb_v)(env, nf, vm, rs1, rs2, rd);
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vsh_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rd)
+{
+ int i, j, k, vl, vlmax, lmul, width, dest, wrote;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (lmul * (nf + 1) > 32) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = nf;
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ wrote = (i * (nf + 1) + k) * 2;
+ cpu_stw_data(env, env->gpr[rs1] + wrote,
+ env->vfp.vreg[dest + k * lmul].s16[j]);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ wrote = (i * (nf + 1) + k) * 2;
+ cpu_stw_data(env, env->gpr[rs1] + wrote,
+ env->vfp.vreg[dest + k * lmul].s32[j]);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ wrote = (i * (nf + 1) + k) * 2;
+ cpu_stw_data(env, env->gpr[rs1] + wrote,
+ env->vfp.vreg[dest + k * lmul].s64[j]);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vssh_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl, vlmax, lmul, width, dest, wrote;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (lmul * (nf + 1) > 32) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = nf;
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ wrote = i * env->gpr[rs2] + k * 2;
+ cpu_stw_data(env, env->gpr[rs1] + wrote,
+ env->vfp.vreg[dest + k * lmul].s16[j]);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ wrote = i * env->gpr[rs2] + k * 2;
+ cpu_stw_data(env, env->gpr[rs1] + wrote,
+ env->vfp.vreg[dest + k * lmul].s32[j]);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ wrote = i * env->gpr[rs2] + k * 2;
+ cpu_stw_data(env, env->gpr[rs1] + wrote,
+ env->vfp.vreg[dest + k * lmul].s64[j]);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vsxh_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl, vlmax, lmul, width, dest, src2;
+ target_ulong addr;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (lmul * (nf + 1) > 32) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = nf;
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ addr = vector_get_index(env, rs1, src2, j, 2, width, k);
+ cpu_stw_data(env, addr,
+ env->vfp.vreg[dest + k * lmul].s16[j]);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ addr = vector_get_index(env, rs1, src2, j, 2, width, k);
+ cpu_stw_data(env, addr,
+ env->vfp.vreg[dest + k * lmul].s32[j]);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ addr = vector_get_index(env, rs1, src2, j, 2, width, k);
+ cpu_stw_data(env, addr,
+ env->vfp.vreg[dest + k * lmul].s64[j]);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vsuxh_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rs2, uint32_t rd)
+{
+ VECTOR_HELPER(vsxh_v)(env, nf, vm, rs1, rs2, rd);
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vsw_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rd)
+{
+ int i, j, k, vl, vlmax, lmul, width, dest, wrote;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (lmul * (nf + 1) > 32) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = nf;
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ wrote = (i * (nf + 1) + k) * 4;
+ cpu_stl_data(env, env->gpr[rs1] + wrote,
+ env->vfp.vreg[dest + k * lmul].s32[j]);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ wrote = (i * (nf + 1) + k) * 4;
+ cpu_stl_data(env, env->gpr[rs1] + wrote,
+ env->vfp.vreg[dest + k * lmul].s64[j]);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vssw_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl, vlmax, lmul, width, dest, wrote;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (lmul * (nf + 1) > 32) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = nf;
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ wrote = i * env->gpr[rs2] + k * 4;
+ cpu_stl_data(env, env->gpr[rs1] + wrote,
+ env->vfp.vreg[dest + k * lmul].s32[j]);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ wrote = i * env->gpr[rs2] + k * 4;
+ cpu_stl_data(env, env->gpr[rs1] + wrote,
+ env->vfp.vreg[dest + k * lmul].s64[j]);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vsxw_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl, vlmax, lmul, width, dest, src2;
+ target_ulong addr;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (lmul * (nf + 1) > 32) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = nf;
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ addr = vector_get_index(env, rs1, src2, j, 4, width, k);
+ cpu_stl_data(env, addr,
+ env->vfp.vreg[dest + k * lmul].s32[j]);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ addr = vector_get_index(env, rs1, src2, j, 4, width, k);
+ cpu_stl_data(env, addr,
+ env->vfp.vreg[dest + k * lmul].s64[j]);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vsuxw_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rs2, uint32_t rd)
+{
+ VECTOR_HELPER(vsxw_v)(env, nf, vm, rs1, rs2, rd);
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vse_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rd)
+{
+ int i, j, k, vl, vlmax, lmul, width, dest, wrote;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (lmul * (nf + 1) > 32) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = nf;
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ wrote = i * (nf + 1) + k;
+ cpu_stb_data(env, env->gpr[rs1] + wrote,
+ env->vfp.vreg[dest + k * lmul].s8[j]);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ wrote = (i * (nf + 1) + k) * 2;
+ cpu_stw_data(env, env->gpr[rs1] + wrote,
+ env->vfp.vreg[dest + k * lmul].s16[j]);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ wrote = (i * (nf + 1) + k) * 4;
+ cpu_stl_data(env, env->gpr[rs1] + wrote,
+ env->vfp.vreg[dest + k * lmul].s32[j]);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ wrote = (i * (nf + 1) + k) * 8;
+ cpu_stq_data(env, env->gpr[rs1] + wrote,
+ env->vfp.vreg[dest + k * lmul].s64[j]);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vsse_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl, vlmax, lmul, width, dest, wrote;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (lmul * (nf + 1) > 32) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = nf;
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ wrote = i * env->gpr[rs2] + k;
+ cpu_stb_data(env, env->gpr[rs1] + wrote,
+ env->vfp.vreg[dest + k * lmul].s8[j]);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ wrote = i * env->gpr[rs2] + k * 2;
+ cpu_stw_data(env, env->gpr[rs1] + wrote,
+ env->vfp.vreg[dest + k * lmul].s16[j]);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ wrote = i * env->gpr[rs2] + k * 4;
+ cpu_stl_data(env, env->gpr[rs1] + wrote,
+ env->vfp.vreg[dest + k * lmul].s32[j]);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ wrote = i * env->gpr[rs2] + k * 8;
+ cpu_stq_data(env, env->gpr[rs1] + wrote,
+ env->vfp.vreg[dest + k * lmul].s64[j]);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vsxe_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl, vlmax, lmul, width, dest, src2;
+ target_ulong addr;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (lmul * (nf + 1) > 32) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = nf;
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ addr = vector_get_index(env, rs1, src2, j, 1, width, k);
+ cpu_stb_data(env, addr,
+ env->vfp.vreg[dest + k * lmul].s8[j]);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ addr = vector_get_index(env, rs1, src2, j, 2, width, k);
+ cpu_stw_data(env, addr,
+ env->vfp.vreg[dest + k * lmul].s16[j]);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ addr = vector_get_index(env, rs1, src2, j, 4, width, k);
+ cpu_stl_data(env, addr,
+ env->vfp.vreg[dest + k * lmul].s32[j]);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ addr = vector_get_index(env, rs1, src2, j, 8, width, k);
+ cpu_stq_data(env, addr,
+ env->vfp.vreg[dest + k * lmul].s64[j]);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vsuxe_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rs2, uint32_t rd)
+{
+ VECTOR_HELPER(vsxe_v)(env, nf, vm, rs1, rs2, rd);
+ env->vfp.vstart = 0;
+}
+
--
2.7.4
* Re: [Qemu-devel] [PATCH v2 05/17] RISC-V: add vector extension load and store instructions
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 05/17] RISC-V: add vector extension load and store instructions liuzhiwei
@ 2019-09-12 14:23 ` Richard Henderson
From: Richard Henderson @ 2019-09-12 14:23 UTC (permalink / raw)
To: liuzhiwei, Alistair.Francis, palmer, sagark, kbastian,
riku.voipio, laurent, wenmeng_zhang
Cc: qemu-riscv, qemu-devel, wxy194768
> +static bool vector_lmul_check_reg(CPURISCVState *env, uint32_t lmul,
> + uint32_t reg, bool widen)
> +{
> + int legal = widen ? (lmul * 2) : lmul;
> +
> + if ((lmul != 1 && lmul != 2 && lmul != 4 && lmul != 8) ||
> + (lmul == 8 && widen)) {
> + helper_raise_exception(env, RISCV_EXCP_ILLEGAL_INST);
> + return false;
> + }
> +
> + if (reg % legal != 0) {
> + helper_raise_exception(env, RISCV_EXCP_ILLEGAL_INST);
> + return false;
> + }
> + return true;
> +}
These exceptions will not do the right thing.
You cannot call helper_raise_exception from another helper, or from something
called from another helper, as here. You need to use riscv_raise_exception, as
you do elsewhere in this patch, with a GETPC() value passed down from the
outermost helper.
Ideally you would check these conditions at translate time.
I've mentioned how to do this in reply to your v1.
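A minimal sketch of that split, written as standalone C rather than against the QEMU tree: the check becomes a pure predicate, and only the outermost helper decides to raise. The names lmul_reg_is_legal, outer_helper and the HELPER_* codes are made up for illustration; in a real helper the failure branch would be the riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC()) call.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical pure predicate: reports whether register number `reg` is a
 * legal base for the given LMUL.  It never raises anything itself.
 */
static bool lmul_reg_is_legal(uint32_t lmul, uint32_t reg, bool widen)
{
    uint32_t legal = widen ? lmul * 2 : lmul;

    if ((lmul != 1 && lmul != 2 && lmul != 4 && lmul != 8) ||
        (lmul == 8 && widen)) {
        return false;
    }
    return reg % legal == 0;
}

/* Illustrative result codes; not part of QEMU. */
enum { HELPER_OK = 0, HELPER_ILLEGAL_INST = 1 };

/*
 * Stand-in for the outermost helper.  This is the only place that reports
 * the failure, so in QEMU it is also the only place that needs GETPC().
 */
static int outer_helper(uint32_t lmul, uint32_t rd)
{
    if (!lmul_reg_is_legal(lmul, rd, false)) {
        return HELPER_ILLEGAL_INST;
    }
    /* ... perform the access ... */
    return HELPER_OK;
}
```

The same predicate could equally be called from the translator, which is where the check would ideally live.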
> +void VECTOR_HELPER(vlbu_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
> + uint32_t rs1, uint32_t rd)
You should pass the rs1 register by value, not by index.
> +{
> + int i, j, k, vl, vlmax, lmul, width, dest, read;
> +
> + vl = env->vfp.vl;
> +
> + lmul = vector_get_lmul(env);
> + width = vector_get_width(env);
> + vlmax = vector_get_vlmax(env);
> +
> + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
> + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
> + return;
> + }
> + if (lmul * (nf + 1) > 32) {
> + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
> + return;
> + }
Again, these exceptions should ideally be identified at translate time.
I also think that you should have at least two different helpers: one that
checks the vector mask and one that doesn't. If you check the above conditions
at translate time then you'll also want to split the helpers based on element
width.
You could also meaningfully split nf == 0 vs nf != 0. You will, in any case,
need to check at translate time whether the Zvlsseg extension is enabled before
allowing nf != 0.
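As a rough illustration of choosing a specialized helper up front, the sketch below maps (vm, element width) to a variant descriptor and rejects illegal widths before any helper would run. The helper_variant type and pick_variant function are hypothetical, and the sketch assumes the vector spec encoding in which vm == 1 means unmasked.

```c
#include <stdint.h>

/* Hypothetical description of which specialized helper to call. */
typedef struct {
    int log2_bytes;   /* element size as a power of two: 0..3 */
    int masked;       /* 1 if the helper must consult the mask register */
} helper_variant;

/*
 * Returns 0 and fills *out on success, or -1 when the element width is not
 * supported, in which case a translator would emit an illegal-instruction
 * exception instead of calling any helper.
 */
static int pick_variant(uint32_t vm, uint32_t sew_bits, helper_variant *out)
{
    int log2_bytes;

    switch (sew_bits) {
    case 8:  log2_bytes = 0; break;
    case 16: log2_bytes = 1; break;
    case 32: log2_bytes = 2; break;
    case 64: log2_bytes = 3; break;
    default: return -1;
    }
    out->log2_bytes = log2_bytes;
    out->masked = (vm == 0);
    return 0;
}
```

With the variant fixed at translate time, each helper body no longer needs the switch on width or the per-element mask test in the unmasked case.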
> +
> + vector_lmul_check_reg(env, lmul, rd, false);
> +
> + for (i = 0; i < vlmax; i++) {
> + dest = rd + (i / (VLEN / width));
> + j = i % (VLEN / width);
This division is exactly why I suggested making vreg[] one contiguous array of
elements instead of a two-dimensional array. I think the distinction of 32
VLEN-sized registers should be reserved for cpu dumps and gdbstub.
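To make the point concrete, here is a small standalone comparison of the two addressing schemes. It assumes VLEN = 128 as in the patch, ignores the segment term (k * lmul) and host endianness, and uses made-up function names offset_2d and offset_flat.

```c
#include <stddef.h>
#include <stdint.h>

#define VLEN  128            /* bits per vector register, as in the patch */
#define VLENB (VLEN / 8)     /* bytes per vector register */

/* Two-dimensional form from the patch: register index plus element index. */
static size_t offset_2d(uint32_t rd, uint32_t i, uint32_t width)
{
    uint32_t per_reg = VLEN / width;
    uint32_t dest = rd + i / per_reg;
    uint32_t j = i % per_reg;

    return (size_t)dest * VLENB + (size_t)j * (width / 8);
}

/*
 * Flat form: all 32 registers are one byte array, so element i of the group
 * starting at rd is found without dividing by the elements per register.
 */
static size_t offset_flat(uint32_t rd, uint32_t i, uint32_t width)
{
    return (size_t)rd * VLENB + (size_t)i * (width / 8);
}
```

Both functions return the same byte offset for every in-range element; the flat form simply skips the per-element division and modulo.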
> + k = nf;
> + if (i < env->vfp.vstart) {
> + continue;
Surely you should hoist this check outside the loop.
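A sketch of the two loop shapes, assuming vstart is read once before the loop and not modified while iterating. The functions visit_checked and visit_hoisted are illustrative only, and the sum stands in for the per-element memory access.

```c
#include <stdint.h>

/* Loop shape from the patch: vstart is tested on every iteration. */
static uint32_t visit_checked(uint32_t vstart, uint32_t vl, uint32_t vlmax)
{
    uint32_t i, sum = 0;

    for (i = 0; i < vlmax; i++) {
        if (i < vstart) {
            continue;
        } else if (i < vl) {
            sum += i;            /* stands in for the per-element access */
        }
    }
    return sum;
}

/* Hoisted shape: start at vstart, stop at vl, no per-element test. */
static uint32_t visit_hoisted(uint32_t vstart, uint32_t vl)
{
    uint32_t i, sum = 0;

    for (i = vstart; i < vl; i++) {
        sum += i;
    }
    return sum;
}
```

For the loads, the tail handling (i >= vl) would then become a second, separate loop from vl to vlmax.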
> + } else if (i < vl) {
> + switch (width) {
> + case 8:
> + if (vector_elem_mask(env, vm, width, lmul, i)) {
> + while (k >= 0) {
> + read = i * (nf + 1) + k;
> + env->vfp.vreg[dest + k * lmul].u8[j] =
> + cpu_ldub_data(env, env->gpr[rs1] + read);
You must not modify vreg[x] before you've recognized all possible exceptions,
e.g. validating that a subsequent access will not trigger a page fault.
Otherwise you will have a partially modified register value when the exception
handler is entered.
Without a stride, and without a predicate mask, this can be done with at most
two calls to probe_access (one per page). This is the simplification that
makes splitting the helper into two very helpful.
A stride or a predicate mask requires either
(1) temporary storage for the loads, and copy back to env at the end, or
(2) use probe_access for each load, and then perform the actual loads directly
into env.
FWIW, ARM SVE uses (1), as probe_access is very new.
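To make option (1) concrete, here is a rough, self-contained sketch. It is not QEMU code: load_u8_fn and toy_load are hypothetical stand-ins for a guest load that would raise an exception, and a boolean return models the fault.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAX_ELEMS 16

/* Hypothetical guest load: returns false where QEMU would raise a fault. */
typedef bool (*load_u8_fn)(uint64_t addr, uint8_t *out);

/* Toy guest memory: 8 readable bytes, every address above that faults. */
static bool toy_load(uint64_t addr, uint8_t *out)
{
    static const uint8_t mem[8] = { 10, 11, 12, 13, 14, 15, 16, 17 };

    if (addr >= sizeof(mem)) {
        return false;
    }
    *out = mem[addr];
    return true;
}

/*
 * Gather every active element into a temporary first, and write the
 * destination only after all loads have succeeded.  On a fault the
 * function returns false and dst is left completely untouched.
 */
static bool vec_load_u8(uint8_t *dst, const bool *mask, unsigned vl,
                        uint64_t base, uint64_t stride, load_u8_fn load)
{
    uint8_t tmp[MAX_ELEMS];
    unsigned i;

    assert(vl <= MAX_ELEMS);
    memcpy(tmp, dst, vl);              /* inactive elements keep their value */
    for (i = 0; i < vl; i++) {
        if (mask[i] && !load(base + i * stride, &tmp[i])) {
            return false;              /* fault: nothing has been committed */
        }
    }
    memcpy(dst, tmp, vl);              /* commit all elements at once */
    return true;
}
```

The cost is one extra copy per instruction, which is what option (2) avoids by probing first.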
> + k--;
> + }
> + env->vfp.vstart++;
> + }
> + break;
> + case 16:
> + if (vector_elem_mask(env, vm, width, lmul, i)) {
> + while (k >= 0) {
> + read = i * (nf + 1) + k;
> + env->vfp.vreg[dest + k * lmul].u16[j] =
> + cpu_ldub_data(env, env->gpr[rs1] + read);
I don't see anything in these assignments to vreg[x].uN[y] that takes the
endianness of the host into account.
You need to think about how the architecture defines the overlap of elements --
particularly across vlset -- and make adjustments.
I can imagine, if you have explicit tests for this, your tests are passing
because the architecture defines a little-endian based indexing of the register
file, and you have only run tests on a little-endian host, like x86_64.
For ARM, we define the representation as a little-endian indexed array of
host-endian uint64_t. This means that a big-endian host needs to adjust the
address of any element smaller than 64-bit. E.g.
#ifdef HOST_WORDS_BIGENDIAN
#define H1(x) ((x) ^ 7)
#define H2(x) ((x) ^ 3)
#define H4(x) ((x) ^ 1)
#else
#define H1(x) (x)
#define H2(x) (x)
#define H4(x) (x)
#endif
env->vfp.vreg[reg + k * lmul].u16[H2(j)]
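As a standalone illustration of what H2 buys you, here is a small sketch. Vec128 and vec_get_u16 are hypothetical names, and the compiler's __BYTE_ORDER__ macros (GCC/Clang) stand in for HOST_WORDS_BIGENDIAN:

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for HOST_WORDS_BIGENDIAN, using GCC/Clang predefined macros. */
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define H2(x) ((x) ^ 3)
#else
#define H2(x) (x)
#endif

/* One 128-bit register stored as host-endian uint64_t words. */
typedef union {
    uint64_t u64[2];
    uint16_t u16[8];
} Vec128;

/*
 * Architectural 16-bit element j, counted from the least significant
 * end of the register, independent of the host byte order.
 */
static inline uint16_t vec_get_u16(const Vec128 *v, unsigned j)
{
    assert(j < 8);
    return v->u16[H2(j)];
}
```

Element 0 is the low 16 bits of u64[0] on either kind of host; without H2 a big-endian host would return the high 16 bits instead.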
> + case 64:
> + if (vector_elem_mask(env, vm, width, lmul, i)) {
> + while (k >= 0) {
> + read = i * (nf + 1) + k;
> + env->vfp.vreg[dest + k * lmul].u64[j] =
> + cpu_ldub_data(env, env->gpr[rs1] + read);
> + k--;
> + }
> + env->vfp.vstart++;
> + }
> + break;
> + default:
> + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
Ideally, this condition is detected at translate time.
You must detect this condition before making any changes to cpu state.
Moreover, the SIGILL should not be skipped because of VSTART.
> +static target_ulong vector_get_index(CPURISCVState *env, int rs1, int rs2,
> + int index, int mem, int width, int nf)
> +{
> + target_ulong abs_off, base = env->gpr[rs1];
You should be passing rs1 by value, not by index.
> + target_long offset;
> + switch (width) {
> + case 8:
> + offset = sign_extend(env->vfp.vreg[rs2].s8[index], 8) + nf * mem;
> + break;
> + case 16:
> + offset = sign_extend(env->vfp.vreg[rs2].s16[index], 16) + nf * mem;
> + break;
> + case 32:
> + offset = sign_extend(env->vfp.vreg[rs2].s32[index], 32) + nf * mem;
> + break;
> + case 64:
> + offset = env->vfp.vreg[rs2].s64[index] + nf * mem;
> + break;
> + default:
> + helper_raise_exception(env, RISCV_EXCP_ILLEGAL_INST);
> + return 0;
> + }
> + if (offset < 0) {
> + abs_off = ~offset + 1;
You have been hanging around hardware people too much.
In software we normally write this "-offset". ;-)
> + if (base >= abs_off) {
> + return base - abs_off;
> + }
> + } else {
> + if ((target_ulong)((target_ulong)offset + base) >= base) {
> + return (target_ulong)offset + base;
> + }
> + }
Why all the extra casting here? They are exactly what is implied by C.
> + helper_raise_exception(env, RISCV_EXCP_ILLEGAL_INST);
> + return 0;
(1) This exception call won't work, as above,
(2) Where does this condition against wraparound come from?
I don't see it in the specification.
(3) You certainly cannot detect this after having written a
previous element to the register file.
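If it turns out that the specification requires no wraparound check, the whole computation reduces to an ordinary modular addition. A sketch, assuming a 64-bit target; the typedefs below only mimic QEMU's target_ulong/target_long, and index_addr is a hypothetical name:

```c
#include <stdint.h>

/* Assumption: RV64, so the target types are 64 bits wide. */
typedef uint64_t target_ulong;
typedef int64_t target_long;

/*
 * The signed offset is converted to unsigned by the usual arithmetic
 * conversions and the sum wraps modulo 2^64, so neither an explicit
 * cast nor a separate negative-offset branch is needed.
 */
static inline target_ulong index_addr(target_ulong base, target_long offset)
{
    return base + offset;
}
```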
[ Skipping lots of functions that are basically the same. ]
> +void VECTOR_HELPER(vsxe_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
> + uint32_t rs1, uint32_t rs2, uint32_t rd)
Pass rs1 by value.
> + case 8:
> + if (vector_elem_mask(env, vm, width, lmul, i)) {
> + while (k >= 0) {
> + addr = vector_get_index(env, rs1, src2, j, 1, width, k);
> + cpu_stb_data(env, addr,
> + env->vfp.vreg[dest + k * lmul].s8[j]);
Must probe_access all of the memory before any stores.
Unlike loads, you don't have the option of storing into a temporary.
Which suggests a common subroutine to perform the probe(s), rather
than bother with a temporary for loads.
> +void VECTOR_HELPER(vsuxe_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
> + uint32_t rs1, uint32_t rs2, uint32_t rd)
> +{
> + return VECTOR_HELPER(vsxe_v)(env, nf, vm, rs1, rs2, rd);
You can't do this and expect the GETPC() for the exceptions raised by vsxe_v to
operate properly. You must define a common helper function and pass in
GETPC(), or preferably not have this second helper function at all. There's no
reason why you cannot call vsxe_v for implementing vsuxe_v. It's merely
laziness within the macros you set up in trans_rvv.inc.c.
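The common-helper shape I have in mind looks roughly like this. Everything here is a toy: ToyEnv, toy_raise, do_vsxe and TOY_GETPC are hypothetical stand-ins for CPURISCVState, riscv_raise_exception, the shared implementation and GETPC().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for GETPC(): the return address of the current function. */
#define TOY_GETPC() ((uintptr_t)__builtin_return_address(0))

typedef struct {
    bool faulted;
    uintptr_t fault_ra;    /* return address used to unwind to the guest pc */
} ToyEnv;

static void toy_raise(ToyEnv *env, uintptr_t ra)
{
    assert(env);
    env->faulted = true;
    env->fault_ra = ra;
}

/* Shared implementation: never computes the return address itself. */
static void do_vsxe(ToyEnv *env, bool illegal, uintptr_t ra)
{
    if (illegal) {
        toy_raise(env, ra);
        return;
    }
    /* ... perform the stores ... */
}

/* Each outermost helper captures its own return address and passes it down. */
void toy_helper_vsxe(ToyEnv *env, bool illegal)
{
    do_vsxe(env, illegal, TOY_GETPC());
}

void toy_helper_vsuxe(ToyEnv *env, bool illegal)
{
    do_vsxe(env, illegal, TOY_GETPC());
}
```

The point is only that the return address is taken in the function that the generated code calls directly, and is then an ordinary argument everywhere below that.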
> + env->vfp.vstart = 0;
> +}
Dead code after the return, in any case.
r~
* Re: [Qemu-devel] [PATCH v2 05/17] RISC-V: add vector extension load and store instructions
2019-09-12 14:23 ` Richard Henderson
@ 2020-01-08 1:32 ` LIU Zhiwei
2020-01-08 2:08 ` Richard Henderson
0 siblings, 1 reply; 43+ messages in thread
From: LIU Zhiwei @ 2020-01-08 1:32 UTC (permalink / raw)
To: Richard Henderson, Alistair.Francis, palmer, Chih-Min Chao
Cc: wenmeng_zhang, qemu-riscv, qemu-devel, wxy194768, Jim Wilson
Hi Richard,
Sorry for replying so late to this comment. I will move forward on part 2.
On 2019/9/12 22:23, Richard Henderson wrote:
>> +static bool vector_lmul_check_reg(CPURISCVState *env, uint32_t lmul,
>> + uint32_t reg, bool widen)
>> +{
>> + int legal = widen ? (lmul * 2) : lmul;
>> +
>> + if ((lmul != 1 && lmul != 2 && lmul != 4 && lmul != 8) ||
>> + (lmul == 8 && widen)) {
>> + helper_raise_exception(env, RISCV_EXCP_ILLEGAL_INST);
>> + return false;
>> + }
>> +
>> + if (reg % legal != 0) {
>> + helper_raise_exception(env, RISCV_EXCP_ILLEGAL_INST);
>> + return false;
>> + }
>> + return true;
>> +}
> These exceptions will not do the right thing.
>
> You cannot call helper_raise_exception from another helper, or from something
> called from another helper, as here. You need to use riscv_raise_exception, as
> you do elsewhere in this patch, with a GETPC() value passed down from the
> outermost helper.
>
> Ideally you would check these conditions at translate time.
> I've mentioned how to do this in reply to your v1.
As discussed in part1, I will check these conditions at translate time.
>> + } else if (i < vl) {
>> + switch (width) {
>> + case 8:
>> + if (vector_elem_mask(env, vm, width, lmul, i)) {
>> + while (k >= 0) {
>> + read = i * (nf + 1) + k;
>> + env->vfp.vreg[dest + k * lmul].u8[j] =
>> + cpu_ldub_data(env, env->gpr[rs1] + read);
> You must not modify vreg[x] before you've recognized all possible exceptions,
> e.g. validating that a subsequent access will not trigger a page fault.
> Otherwise you will have a partially modified register value when the exception
> handler is entered.
There are two questions here.
1) How to validate access before real access to registers?
As pointed out in another comment on patchset v1,
"instructions that perform more than one host store must probe
the entire range to be stored before performing any stores."
I didn't see any page validation in SVE. For example, sve_st1_r
directly uses helper_ret_*_mmu, which may cause a page fault
exception or overlap a watchpoint, without first probing the entire
range to be stored.
2) Why not use the cpu_ld* API?
I see that SVE uses ld*_p to access host memory directly and
helper_ret_*_mmu to access guest memory. But from its definition, cpu_ld*
is the combination of ld*_p and helper_ret_*_mmu:
entry = tlb_entry(env, mmu_idx, addr);
if (unlikely(entry->ADDR_READ !=
(addr & (TARGET_PAGE_MASK | (DATA_SIZE - 1))))) {
oi = make_memop_idx(SHIFT, mmu_idx);
res = glue(glue(helper_ret_ld, URETSUFFIX), MMUSUFFIX)(env, addr,
oi, retaddr);
} else {
uintptr_t hostaddr = addr + entry->addend;
res = glue(glue(ld, USUFFIX), _p)((uint8_t *)hostaddr);
}
So I don't see why we should not use the cpu_ld* API.
> Without a stride, and without a predicate mask, this can be done with at most
> two calls to probe_access (one per page). This is the simplification that
> makes splitting the helper into two very helpful.
>
> With a stride or with a predicate mask requires either
> (1) temporary storage for the loads, and copy back to env at the end, or
> (2) use probe_access for each load, and then perform the actual loads directly
> into env.
>
> FWIW, ARM SVE uses (1), as probe_access is very new.
>
>> + k--;
>> + }
>> + env->vfp.vstart++;
>> + }
>> + break;
>> + case 16:
>> + if (vector_elem_mask(env, vm, width, lmul, i)) {
>> + while (k >= 0) {
>> + read = i * (nf + 1) + k;
>> + env->vfp.vreg[dest + k * lmul].u16[j] =
>> + cpu_ldub_data(env, env->gpr[rs1] + read);
> I don't see anything in these assignments to vreg[x].uN[y] that take the
> endianness of the host into account.
>
> You need to think about how the architecture defines the overlap of elements --
> particularly across vlset -- and make adjustments.
>
> I can imagine, if you have explicit tests for this, your tests are passing
> because the architecture defines a little-endian based indexing of the register
> file, and you have only run tests on a little-endian host, like x86_64.
>
> For ARM, we define the representation as a little-endian indexed array of
> host-endian uint64_t. This means that a big-endian host needs to adjust the
> address of any element smaller than 64-bit. E.g.
>
> #ifdef HOST_WORDS_BIGENDIAN
> #define H1(x) ((x) ^ 7)
> #define H2(x) ((x) ^ 3)
> #define H4(x) ((x) ^ 1)
> #else
> #define H1(x) (x)
> #define H2(x) (x)
> #define H4(x) (x)
> #endif
>
> env->vfp.vreg[reg + k * lmul].u16[H2(j)]
>
I will take it. However, I don't have a big-endian host on which to test the
feature.
>
>> + if (base >= abs_off) {
>> + return base - abs_off;
>> + }
>> + } else {
>> + if ((target_ulong)((target_ulong)offset + base) >= base) {
>> + return (target_ulong)offset + base;
>> + }
>> + }
> Why all the extra casting here? They are exactly what is implied by C.
>
>> + helper_raise_exception(env, RISCV_EXCP_ILLEGAL_INST);
>> + return 0;
> (1) This exception call won't work, as above,
> (2) Where does this condition against wraparound come from?
> I don't see it in the specification.
> (3) You certainly cannot detect this after having written a
> previous element to the register file.
>
> [ Skipping lots of functions that are basically the same. ]
>
>> +void VECTOR_HELPER(vsxe_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
>> + uint32_t rs1, uint32_t rs2, uint32_t rd)
> Pass rs1 by value.
>
>> + case 8:
>> + if (vector_elem_mask(env, vm, width, lmul, i)) {
>> + while (k >= 0) {
>> + addr = vector_get_index(env, rs1, src2, j, 1, width, k);
>> + cpu_stb_data(env, addr,
>> + env->vfp.vreg[dest + k * lmul].s8[j]);
> Must probe_access all of the memory before any stores.
> Unlike loads, you don't have the option of storing into a temporary.
> Which suggests a common subroutine to perform the probe(s), rather
> than bother with a temporary for loads.
>
> r~
Thanks again for your informative comments.
Best Regards,
Zhiwei
* Re: [Qemu-devel] [PATCH v2 05/17] RISC-V: add vector extension load and store instructions
2020-01-08 1:32 ` LIU Zhiwei
@ 2020-01-08 2:08 ` Richard Henderson
0 siblings, 0 replies; 43+ messages in thread
From: Richard Henderson @ 2020-01-08 2:08 UTC (permalink / raw)
To: LIU Zhiwei, Alistair.Francis, palmer, Chih-Min Chao
Cc: wenmeng_zhang, qemu-riscv, qemu-devel, wxy194768, Jim Wilson
On 1/8/20 11:32 AM, LIU Zhiwei wrote:
>>> + switch (width) {
>>> + case 8:
>>> + if (vector_elem_mask(env, vm, width, lmul, i)) {
>>> + while (k >= 0) {
>>> + read = i * (nf + 1) + k;
>>> + env->vfp.vreg[dest + k * lmul].u8[j] =
>>> + cpu_ldub_data(env, env->gpr[rs1] + read);
>> You must not modify vreg[x] before you've recognized all possible exceptions,
>> e.g. validating that a subsequent access will not trigger a page fault.
>> Otherwise you will have a partially modified register value when the exception
>> handler is entered.
> There are two questions here.
>
> 1) How to validate access before real access to registers?
>
> As pointed in another comment for patchset v1,
>
> "instructions that perform more than one host store must probe
> the entire range to be stored before performing any stores.
> "
Use probe_access (or one of the probe_write/probe_read helpers).
Ideally one would then use the result, which is a host address, and perform
direct loads/stores using that. The result may be null, indicating that the
operation needs the i/o path. But in any case, after the probe we are
guaranteed that the page is mapped and readable/writable.
Note that probe_* does not allow [addr, addr+size) to cross a page boundary.
So you do have to be prepared for the vector operation to span 2 pages,
and probe each of them separately.
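The page split can be sketched as follows. This is not the QEMU API: probe_fn and toy_probe are hypothetical stand-ins for probe_access, PAGE_SIZE stands in for TARGET_PAGE_SIZE, and a boolean result models the fault.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE 4096u    /* assumption; QEMU would use TARGET_PAGE_SIZE */

/* Hypothetical probe: false where the real one would raise a fault. */
typedef bool (*probe_fn)(uint64_t addr, uint64_t size);

/* Toy probe: only pages 0 and 1 are mapped; ranges must stay on one page. */
static bool toy_probe(uint64_t addr, uint64_t size)
{
    assert(size > 0 && addr / PAGE_SIZE == (addr + size - 1) / PAGE_SIZE);
    return addr + size <= 2 * PAGE_SIZE;
}

/* Number of bytes of [addr, addr + size) that lie on the first page. */
static uint64_t first_page_len(uint64_t addr, uint64_t size)
{
    uint64_t to_end = PAGE_SIZE - (addr & (PAGE_SIZE - 1));

    return size < to_end ? size : to_end;
}

/* Probe a contiguous range of at most one page in size: one or two calls. */
static bool probe_range(uint64_t addr, uint64_t size, probe_fn probe)
{
    uint64_t len1 = first_page_len(addr, size);

    assert(size <= PAGE_SIZE);
    if (size == 0) {
        return true;
    }
    if (!probe(addr, len1)) {
        return false;
    }
    return size == len1 || probe(addr + len1, size - len1);
}
```

Only once probe_range has succeeded for the whole access would the helper start writing to the register file or to memory.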
> I didn't see any page validation in SVE. For example, sve_st1_r
> directly uses helper_ret_*_mmu, which may cause a page fault
> exception or overlap a watchpoint, without first probing the entire
> range to be stored.
Yes, this is a bug in SVE that will be fixed.
Note that you should not use helper_ret_* anymore. I've just introduced
cpu_{ld,st}*_mmuidx_ra() that should be used instead.
> 2) Why not use the cpu_ld* API?
It's possible to use cpu_ld*, but then you need to store the results into a
temporary, and copy the result to the register afterward.
But I think it's better to probe first and avoid a second copy.
> I see in SVE that ld*_p is used to directly access the host memory. And
> helper_ret_*_mmu
> is used to access guest memory. But from the definition of cpu_ld*, it's the
> combination of
> ld*_p and helper_ret_*_mmu.
This is all changed now, FWIW.
> I will take it. However I didn't have a big-endian host to test the feature.
You can apply for a gcc compile farm account, and then you will have access to
ppc64 big-endian hosts.
https://cfarm.tetaneutral.net/users/new/
r~
* [Qemu-devel] [PATCH v2 06/17] RISC-V: add vector extension fault-only-first implementation
2019-09-11 6:25 [Qemu-devel] [PATCH v2 00/17] RISC-V: support vector extension liuzhiwei
` (4 preceding siblings ...)
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 05/17] RISC-V: add vector extension load and store instructions liuzhiwei
@ 2019-09-11 6:25 ` liuzhiwei
2019-09-12 14:32 ` Richard Henderson
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 07/17] RISC-V: add vector extension atomic instructions liuzhiwei
` (11 subsequent siblings)
17 siblings, 1 reply; 43+ messages in thread
From: liuzhiwei @ 2019-09-11 6:25 UTC (permalink / raw)
To: Alistair.Francis, palmer, sagark, kbastian, riku.voipio, laurent,
wenmeng_zhang
Cc: qemu-riscv, qemu-devel, wxy194768, LIU Zhiwei
From: LIU Zhiwei <zhiwei_liu@c-sky.com>
Signed-off-by: LIU Zhiwei <zhiwei_liu@c-sky.com>
---
linux-user/riscv/cpu_loop.c | 7 +
target/riscv/cpu_helper.c | 7 +
target/riscv/helper.h | 7 +
target/riscv/insn32.decode | 7 +
target/riscv/insn_trans/trans_rvv.inc.c | 7 +
target/riscv/vector_helper.c | 567 ++++++++++++++++++++++++++++++++
6 files changed, 602 insertions(+)
diff --git a/linux-user/riscv/cpu_loop.c b/linux-user/riscv/cpu_loop.c
index 12aa3c0..d673fa5 100644
--- a/linux-user/riscv/cpu_loop.c
+++ b/linux-user/riscv/cpu_loop.c
@@ -41,6 +41,13 @@ void cpu_loop(CPURISCVState *env)
sigcode = 0;
sigaddr = 0;
+ if (env->foflag) {
+ if (env->vfp.vl != 0) {
+ env->foflag = false;
+ env->pc += 4;
+ continue;
+ }
+ }
switch (trapnr) {
case EXCP_INTERRUPT:
/* just indicate that signals should be handled asap */
diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
index e32b612..405caf6 100644
--- a/target/riscv/cpu_helper.c
+++ b/target/riscv/cpu_helper.c
@@ -521,6 +521,13 @@ void riscv_cpu_do_interrupt(CPUState *cs)
[PRV_H] = RISCV_EXCP_H_ECALL,
[PRV_M] = RISCV_EXCP_M_ECALL
};
+ if (env->foflag) {
+ if (env->vfp.vl != 0) {
+ env->foflag = false;
+ env->pc += 4;
+ return;
+ }
+ }
if (!async) {
/* set tval to badaddr for traps with address information */
diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index f77c392..973342f 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -84,6 +84,13 @@ DEF_HELPER_5(vector_vle_v, void, env, i32, i32, i32, i32)
DEF_HELPER_5(vector_vlbu_v, void, env, i32, i32, i32, i32)
DEF_HELPER_5(vector_vlhu_v, void, env, i32, i32, i32, i32)
DEF_HELPER_5(vector_vlwu_v, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vlbff_v, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vlhff_v, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vlwff_v, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vleff_v, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vlbuff_v, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vlhuff_v, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vlwuff_v, void, env, i32, i32, i32, i32)
DEF_HELPER_5(vector_vsb_v, void, env, i32, i32, i32, i32)
DEF_HELPER_5(vector_vsh_v, void, env, i32, i32, i32, i32)
DEF_HELPER_5(vector_vsw_v, void, env, i32, i32, i32, i32)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index b8a3d8a..b286997 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -218,6 +218,13 @@ vle_v ... 000 . 00000 ..... 111 ..... 0000111 @r2_nfvm
vlbu_v ... 000 . 00000 ..... 000 ..... 0000111 @r2_nfvm
vlhu_v ... 000 . 00000 ..... 101 ..... 0000111 @r2_nfvm
vlwu_v ... 000 . 00000 ..... 110 ..... 0000111 @r2_nfvm
+vlbff_v ... 100 . 10000 ..... 000 ..... 0000111 @r2_nfvm
+vlhff_v ... 100 . 10000 ..... 101 ..... 0000111 @r2_nfvm
+vlwff_v ... 100 . 10000 ..... 110 ..... 0000111 @r2_nfvm
+vleff_v ... 000 . 10000 ..... 111 ..... 0000111 @r2_nfvm
+vlbuff_v ... 000 . 10000 ..... 000 ..... 0000111 @r2_nfvm
+vlhuff_v ... 000 . 10000 ..... 101 ..... 0000111 @r2_nfvm
+vlwuff_v ... 000 . 10000 ..... 110 ..... 0000111 @r2_nfvm
vsb_v ... 000 . 00000 ..... 000 ..... 0100111 @r2_nfvm
vsh_v ... 000 . 00000 ..... 101 ..... 0100111 @r2_nfvm
vsw_v ... 000 . 00000 ..... 110 ..... 0100111 @r2_nfvm
diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index 16b1f90..bd83885 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -80,6 +80,13 @@ GEN_VECTOR_R2_NFVM(vle_v)
GEN_VECTOR_R2_NFVM(vlbu_v)
GEN_VECTOR_R2_NFVM(vlhu_v)
GEN_VECTOR_R2_NFVM(vlwu_v)
+GEN_VECTOR_R2_NFVM(vlbff_v)
+GEN_VECTOR_R2_NFVM(vlhff_v)
+GEN_VECTOR_R2_NFVM(vlwff_v)
+GEN_VECTOR_R2_NFVM(vleff_v)
+GEN_VECTOR_R2_NFVM(vlbuff_v)
+GEN_VECTOR_R2_NFVM(vlhuff_v)
+GEN_VECTOR_R2_NFVM(vlwuff_v)
GEN_VECTOR_R2_NFVM(vsb_v)
GEN_VECTOR_R2_NFVM(vsh_v)
GEN_VECTOR_R2_NFVM(vsw_v)
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 62e4d2e..0ac8c74 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -2762,3 +2762,570 @@ void VECTOR_HELPER(vsuxe_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
env->vfp.vstart = 0;
}
+void VECTOR_HELPER(vlbuff_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rd)
+{
+ int i, j, k, vl, vlmax, lmul, width, dest, read;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (lmul * (nf + 1) > 32) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ env->foflag = true;
+ env->vfp.vl = 0;
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = nf;
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = i * (nf + 1) + k;
+ env->vfp.vreg[dest + k * lmul].u8[j] =
+ cpu_ldub_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ env->vfp.vl++;
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = i * (nf + 1) + k;
+ env->vfp.vreg[dest + k * lmul].u16[j] =
+ cpu_ldub_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ env->vfp.vl++;
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = i * (nf + 1) + k;
+ env->vfp.vreg[dest + k * lmul].u32[j] =
+ cpu_ldub_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ env->vfp.vl++;
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = i * (nf + 1) + k;
+ env->vfp.vreg[dest + k * lmul].u64[j] =
+ cpu_ldub_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ env->vfp.vl++;
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_segment(env, dest, j, width, k, lmul);
+ }
+ }
+ env->foflag = false;
+ env->vfp.vl = vl;
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vlbff_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rd)
+{
+ int i, j, k, vl, vlmax, lmul, width, dest, read;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (lmul * (nf + 1) > 32) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rd, false);
+ env->foflag = true;
+ env->vfp.vl = 0;
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = nf;
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = i * (nf + 1) + k;
+ env->vfp.vreg[dest + k * lmul].s8[j] =
+ cpu_ldsb_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ env->vfp.vl++;
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = i * (nf + 1) + k;
+ env->vfp.vreg[dest + k * lmul].s16[j] = sign_extend(
+ cpu_ldsb_data(env, env->gpr[rs1] + read), 8);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ env->vfp.vl++;
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = i * (nf + 1) + k;
+ env->vfp.vreg[dest + k * lmul].s32[j] = sign_extend(
+ cpu_ldsb_data(env, env->gpr[rs1] + read), 8);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ env->vfp.vl++;
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = i * (nf + 1) + k;
+ env->vfp.vreg[dest + k * lmul].s64[j] = sign_extend(
+ cpu_ldsb_data(env, env->gpr[rs1] + read), 8);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ env->vfp.vl++;
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_segment(env, dest, j, width, k, lmul);
+ }
+ }
+ env->foflag = false;
+ env->vfp.vl = vl;
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vlhuff_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rd)
+{
+ int i, j, k, vl, vlmax, lmul, width, dest, read;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (lmul * (nf + 1) > 32) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rd, false);
+ env->foflag = true;
+ env->vfp.vl = 0;
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = nf;
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = (i * (nf + 1) + k) * 2;
+ env->vfp.vreg[dest + k * lmul].u16[j] =
+ cpu_lduw_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ env->vfp.vl++;
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = (i * (nf + 1) + k) * 2;
+ env->vfp.vreg[dest + k * lmul].u32[j] =
+ cpu_lduw_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ env->vfp.vl++;
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = (i * (nf + 1) + k) * 2;
+ env->vfp.vreg[dest + k * lmul].u64[j] =
+ cpu_lduw_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ env->vfp.vl++;
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_segment(env, dest, j, width, k, lmul);
+ }
+ }
+ env->foflag = false;
+ env->vfp.vl = vl;
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vlhff_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rd)
+{
+ int i, j, k, vl, vlmax, lmul, width, dest, read;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (lmul * (nf + 1) > 32) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rd, false);
+ env->foflag = true;
+ env->vfp.vl = 0;
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = nf;
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = (i * (nf + 1) + k) * 2;
+ env->vfp.vreg[dest + k * lmul].s16[j] =
+ cpu_ldsw_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ env->vfp.vl++;
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = (i * (nf + 1) + k) * 2;
+ env->vfp.vreg[dest + k * lmul].s32[j] = sign_extend(
+ cpu_ldsw_data(env, env->gpr[rs1] + read), 16);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ env->vfp.vl++;
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = (i * (nf + 1) + k) * 2;
+ env->vfp.vreg[dest + k * lmul].s64[j] = sign_extend(
+ cpu_ldsw_data(env, env->gpr[rs1] + read), 16);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ env->vfp.vl++;
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_segment(env, dest, j, width, k, lmul);
+ }
+ }
+ env->vfp.vl = vl;
+ env->foflag = false;
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vlwuff_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rd)
+{
+ int i, j, k, vl, vlmax, lmul, width, dest, read;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (lmul * (nf + 1) > 32) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rd, false);
+ env->foflag = true;
+ env->vfp.vl = 0;
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = nf;
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = (i * (nf + 1) + k) * 4;
+ env->vfp.vreg[dest + k * lmul].u32[j] =
+ cpu_ldl_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ env->vfp.vl++;
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = (i * (nf + 1) + k) * 4;
+ env->vfp.vreg[dest + k * lmul].u64[j] =
+ cpu_ldl_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ env->vfp.vl++;
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_segment(env, dest, j, width, k, lmul);
+ }
+ }
+ env->foflag = false;
+ env->vfp.vl = vl;
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vlwff_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rd)
+{
+ int i, j, k, vl, vlmax, lmul, width, dest, read;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (lmul * (nf + 1) > 32) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rd, false);
+ env->foflag = true;
+ env->vfp.vl = 0;
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = nf;
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = (i * (nf + 1) + k) * 4;
+ env->vfp.vreg[dest + k * lmul].s32[j] =
+ cpu_ldl_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ env->vfp.vl++;
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = (i * (nf + 1) + k) * 4;
+ env->vfp.vreg[dest + k * lmul].s64[j] = sign_extend(
+ cpu_ldl_data(env, env->gpr[rs1] + read), 32);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ env->vfp.vl++;
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_segment(env, dest, j, width, k, lmul);
+ }
+ }
+ env->foflag = false;
+ env->vfp.vl = vl;
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vleff_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
+ uint32_t rs1, uint32_t rd)
+{
+ int i, j, k, vl, vlmax, lmul, width, dest, read;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (lmul * (nf + 1) > 32) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rd, false);
+ env->vfp.vl = 0;
+ env->foflag = true;
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = nf;
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = i * (nf + 1) + k;
+ env->vfp.vreg[dest + k * lmul].u8[j] =
+ cpu_ldub_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ env->vfp.vl++;
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = (i * (nf + 1) + k) * 2;
+ env->vfp.vreg[dest + k * lmul].u16[j] =
+ cpu_lduw_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ env->vfp.vl++;
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = (i * (nf + 1) + k) * 4;
+ env->vfp.vreg[dest + k * lmul].u32[j] =
+ cpu_ldl_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ env->vfp.vl++;
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ while (k >= 0) {
+ read = (i * (nf + 1) + k) * 8;
+ env->vfp.vreg[dest + k * lmul].u64[j] =
+ cpu_ldq_data(env, env->gpr[rs1] + read);
+ k--;
+ }
+ env->vfp.vstart++;
+ }
+ env->vfp.vl++;
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_segment(env, dest, j, width, k, lmul);
+ }
+ }
+ env->foflag = false;
+ env->vfp.vl = vl;
+ env->vfp.vstart = 0;
+}
--
2.7.4
* Re: [Qemu-devel] [PATCH v2 06/17] RISC-V: add vector extension fault-only-first implementation
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 06/17] RISC-V: add vector extension fault-only-first implementation liuzhiwei
@ 2019-09-12 14:32 ` Richard Henderson
0 siblings, 0 replies; 43+ messages in thread
From: Richard Henderson @ 2019-09-12 14:32 UTC (permalink / raw)
To: liuzhiwei, Alistair.Francis, palmer, sagark, kbastian,
riku.voipio, laurent, wenmeng_zhang
Cc: qemu-riscv, qemu-devel, wxy194768
On 9/11/19 2:25 AM, liuzhiwei wrote:
> diff --git a/linux-user/riscv/cpu_loop.c b/linux-user/riscv/cpu_loop.c
> index 12aa3c0..d673fa5 100644
> --- a/linux-user/riscv/cpu_loop.c
> +++ b/linux-user/riscv/cpu_loop.c
> @@ -41,6 +41,13 @@ void cpu_loop(CPURISCVState *env)
> sigcode = 0;
> sigaddr = 0;
>
> + if (env->foflag) {
> + if (env->vfp.vl != 0) {
> + env->foflag = false;
> + env->pc += 4;
> + continue;
> + }
> + }
> switch (trapnr) {
> case EXCP_INTERRUPT:
> /* just indicate that signals should be handled asap */
> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> index e32b612..405caf6 100644
> --- a/target/riscv/cpu_helper.c
> +++ b/target/riscv/cpu_helper.c
> @@ -521,6 +521,13 @@ void riscv_cpu_do_interrupt(CPUState *cs)
> [PRV_H] = RISCV_EXCP_H_ECALL,
> [PRV_M] = RISCV_EXCP_M_ECALL
> };
> + if (env->foflag) {
> + if (env->vfp.vl != 0) {
> + env->foflag = false;
> + env->pc += 4;
> + return;
> + }
> + }
I renew my objection to this FOFLAG mechanism. I believe, but have no proof,
that this will race between different types of interrupts. Once again I
present the ARM SVE first-fault helpers as proof that there is another way.
Otherwise, all of the same comments from the normal loads apply.
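To make the alternative concrete, here is a minimal sketch of fault-only-first semantics that keeps no flag in CPU state. It is not the SVE code and not a proposed patch: GuestMem, probe_read and vleff_bytes are hypothetical names, and guest memory is modelled as a flat array. The idea is that each element is probed before it is loaded; only element 0 may trap, and a failed probe on any later element just truncates vl.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical guest memory model: addresses below mem_size are mapped. */
typedef struct {
    const uint8_t *mem;
    size_t mem_size;
} GuestMem;

/* Hypothetical non-faulting probe: true if addr can be read. */
static bool probe_read(const GuestMem *g, size_t addr)
{
    return addr < g->mem_size;
}

/*
 * Fault-only-first byte load, sketched for unit-stride bytes only.
 * Returns the new vl, or -1 if element 0 itself is unreadable (the
 * caller would then raise the exception in the ordinary way).
 */
static int vleff_bytes(const GuestMem *g, size_t base, uint8_t *dst, int vl)
{
    for (int i = 0; i < vl; i++) {
        if (!probe_read(g, base + i)) {
            /* Only element 0 may trap; later faults just shorten vl. */
            return i == 0 ? -1 : i;
        }
        dst[i] = g->mem[base + i];
    }
    return vl;
}
```

In a real helper the probe would be a non-faulting address translation, and the -1 case would raise the exception through the normal path, so neither cpu_loop nor riscv_cpu_do_interrupt would need to know about the instruction.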
r~
* [Qemu-devel] [PATCH v2 07/17] RISC-V: add vector extension atomic instructions
2019-09-11 6:25 [Qemu-devel] [PATCH v2 00/17] RISC-V: support vector extension liuzhiwei
` (5 preceding siblings ...)
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 06/17] RISC-V: add vector extension fault-only-first implementation liuzhiwei
@ 2019-09-11 6:25 ` liuzhiwei
2019-09-12 14:57 ` Richard Henderson
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 08/17] RISC-V: add vector extension integer instructions part1, add/sub/adc/sbc liuzhiwei
` (10 subsequent siblings)
17 siblings, 1 reply; 43+ messages in thread
From: liuzhiwei @ 2019-09-11 6:25 UTC (permalink / raw)
To: Alistair.Francis, palmer, sagark, kbastian, riku.voipio, laurent,
wenmeng_zhang
Cc: qemu-riscv, qemu-devel, wxy194768, LIU Zhiwei
From: LIU Zhiwei <zhiwei_liu@c-sky.com>
Signed-off-by: LIU Zhiwei <zhiwei_liu@c-sky.com>
---
target/riscv/helper.h | 18 +
target/riscv/insn32.decode | 21 +
target/riscv/insn_trans/trans_rvv.inc.c | 36 +
target/riscv/vector_helper.c | 1467 +++++++++++++++++++++++++++++++
4 files changed, 1542 insertions(+)
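
The helpers below repeat two pieces of arithmetic, sketched here for readers following the diff: mapping element i to a register offset and lane, and the 32-bit AMO on a 64-bit element (operate on the low 32 bits, then sign-extend the old memory value on write-back when wd is set). This is only an illustration under stated assumptions: it uses C11 atomics rather than QEMU's helper_atomic_* functions, and elem_to_lane and amoswapw_elem64 are hypothetical names that do not appear in the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define VLEN_BITS 128  /* same value as VLEN in the series' cpu.h */

/* Map element i to (register offset, lane) for a given SEW in bits. */
static void elem_to_lane(int i, int sew, int *reg_off, int *lane)
{
    int per_reg = VLEN_BITS / sew;  /* elements held by one register */
    *reg_off = i / per_reg;
    *lane = i % per_reg;
}

/*
 * One element of a 32-bit swap AMO with SEW=64: the low 32 bits of the
 * lane are swapped into memory; if wd is set, the old memory value is
 * sign-extended back into the lane.
 */
static void amoswapw_elem64(_Atomic uint32_t *mem, int64_t *vs3_lane, bool wd)
{
    uint32_t old = atomic_exchange(mem, (uint32_t)*vs3_lane);
    if (wd) {
        *vs3_lane = (int64_t)(int32_t)old;
    }
}
```

With VLEN=128 and SEW=32, element 5 lands in the second register of the group at lane 1, which is the vs + (i / (VLEN / width)) and i % (VLEN / width) computation used throughout the helpers.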
diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 973342f..c107925 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -121,5 +121,23 @@ DEF_HELPER_6(vector_vsuxb_v, void, env, i32, i32, i32, i32, i32)
DEF_HELPER_6(vector_vsuxh_v, void, env, i32, i32, i32, i32, i32)
DEF_HELPER_6(vector_vsuxw_v, void, env, i32, i32, i32, i32, i32)
DEF_HELPER_6(vector_vsuxe_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vamoswapw_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vamoswapd_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vamoaddw_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vamoaddd_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vamoxorw_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vamoxord_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vamoandw_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vamoandd_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vamoorw_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vamoord_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vamominw_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vamomind_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vamomaxw_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vamomaxd_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vamominuw_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vamominud_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vamomaxuw_v, void, env, i32, i32, i32, i32, i32)
+DEF_HELPER_6(vector_vamomaxud_v, void, env, i32, i32, i32, i32, i32)
DEF_HELPER_4(vector_vsetvli, void, env, i32, i32, i32)
DEF_HELPER_4(vector_vsetvl, void, env, i32, i32, i32)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index b286997..48e7661 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -63,6 +63,7 @@
@r_rm ....... ..... ..... ... ..... ....... %rs2 %rs1 %rm %rd
@r2_rm ....... ..... ..... ... ..... ....... %rs1 %rm %rd
@r2 ....... ..... ..... ... ..... ....... %rs1 %rd
+@r_wdvm ..... wd:1 vm:1 ..... ..... ... ..... ....... %rs2 %rs1 %rd
@r_nfvm nf:3 ... vm:1 ..... ..... ... ..... ....... %rs2 %rs1 %rd
@r2_nfvm nf:3 ... vm:1 ..... ..... ... ..... ....... %rs1 %rd
@r2_zimm . zimm:11 ..... ... ..... ....... %rs1 %rd
@@ -258,6 +259,26 @@ vsuxh_v ... 111 . ..... ..... 101 ..... 0100111 @r_nfvm
vsuxw_v ... 111 . ..... ..... 110 ..... 0100111 @r_nfvm
vsuxe_v ... 111 . ..... ..... 111 ..... 0100111 @r_nfvm
+#*** Vector AMO operations are encoded under the standard AMO major opcode.***
+vamoswapw_v 00001 . . ..... ..... 110 ..... 0101111 @r_wdvm
+vamoswapd_v 00001 . . ..... ..... 111 ..... 0101111 @r_wdvm
+vamoaddw_v 00000 . . ..... ..... 110 ..... 0101111 @r_wdvm
+vamoaddd_v 00000 . . ..... ..... 111 ..... 0101111 @r_wdvm
+vamoxorw_v 00100 . . ..... ..... 110 ..... 0101111 @r_wdvm
+vamoxord_v 00100 . . ..... ..... 111 ..... 0101111 @r_wdvm
+vamoandw_v 01100 . . ..... ..... 110 ..... 0101111 @r_wdvm
+vamoandd_v 01100 . . ..... ..... 111 ..... 0101111 @r_wdvm
+vamoorw_v 01000 . . ..... ..... 110 ..... 0101111 @r_wdvm
+vamoord_v 01000 . . ..... ..... 111 ..... 0101111 @r_wdvm
+vamominw_v 10000 . . ..... ..... 110 ..... 0101111 @r_wdvm
+vamomind_v 10000 . . ..... ..... 111 ..... 0101111 @r_wdvm
+vamomaxw_v 10100 . . ..... ..... 110 ..... 0101111 @r_wdvm
+vamomaxd_v 10100 . . ..... ..... 111 ..... 0101111 @r_wdvm
+vamominuw_v 11000 . . ..... ..... 110 ..... 0101111 @r_wdvm
+vamominud_v 11000 . . ..... ..... 111 ..... 0101111 @r_wdvm
+vamomaxuw_v 11100 . . ..... ..... 110 ..... 0101111 @r_wdvm
+vamomaxud_v 11100 . . ..... ..... 111 ..... 0101111 @r_wdvm
+
#*** new major opcode OP-V ***
vsetvli 0 ........... ..... 111 ..... 1010111 @r2_zimm
vsetvl 1000000 ..... ..... 111 ..... 1010111 @r
diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index bd83885..7bda378 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -47,6 +47,23 @@ static bool trans_##INSN(DisasContext *ctx, arg_##INSN * a) \
return true; \
}
+#define GEN_VECTOR_R_WDVM(INSN) \
+static bool trans_##INSN(DisasContext *ctx, arg_##INSN * a) \
+{ \
+ TCGv_i32 s1 = tcg_const_i32(a->rs1); \
+ TCGv_i32 s2 = tcg_const_i32(a->rs2); \
+ TCGv_i32 d = tcg_const_i32(a->rd); \
+ TCGv_i32 wd = tcg_const_i32(a->wd); \
+ TCGv_i32 vm = tcg_const_i32(a->vm); \
+ gen_helper_vector_##INSN(cpu_env, wd, vm, s1, s2, d);\
+ tcg_temp_free_i32(s1); \
+ tcg_temp_free_i32(s2); \
+ tcg_temp_free_i32(d); \
+ tcg_temp_free_i32(wd); \
+ tcg_temp_free_i32(vm); \
+ return true; \
+}
+
#define GEN_VECTOR_R(INSN) \
static bool trans_##INSN(DisasContext *ctx, arg_##INSN * a) \
{ \
@@ -119,5 +136,24 @@ GEN_VECTOR_R_NFVM(vsuxh_v)
GEN_VECTOR_R_NFVM(vsuxw_v)
GEN_VECTOR_R_NFVM(vsuxe_v)
+GEN_VECTOR_R_WDVM(vamoswapw_v)
+GEN_VECTOR_R_WDVM(vamoswapd_v)
+GEN_VECTOR_R_WDVM(vamoaddw_v)
+GEN_VECTOR_R_WDVM(vamoaddd_v)
+GEN_VECTOR_R_WDVM(vamoxorw_v)
+GEN_VECTOR_R_WDVM(vamoxord_v)
+GEN_VECTOR_R_WDVM(vamoandw_v)
+GEN_VECTOR_R_WDVM(vamoandd_v)
+GEN_VECTOR_R_WDVM(vamoorw_v)
+GEN_VECTOR_R_WDVM(vamoord_v)
+GEN_VECTOR_R_WDVM(vamominw_v)
+GEN_VECTOR_R_WDVM(vamomind_v)
+GEN_VECTOR_R_WDVM(vamomaxw_v)
+GEN_VECTOR_R_WDVM(vamomaxd_v)
+GEN_VECTOR_R_WDVM(vamominuw_v)
+GEN_VECTOR_R_WDVM(vamominud_v)
+GEN_VECTOR_R_WDVM(vamomaxuw_v)
+GEN_VECTOR_R_WDVM(vamomaxud_v)
+
GEN_VECTOR_R2_ZIMM(vsetvli)
GEN_VECTOR_R(vsetvl)
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 0ac8c74..9ebf70d 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -136,6 +136,21 @@ static bool vector_lmul_check_reg(CPURISCVState *env, uint32_t lmul,
return true;
}
+static void vector_tail_amo(CPURISCVState *env, int vreg, int index, int width)
+{
+ switch (width) {
+ case 32:
+ env->vfp.vreg[vreg].u32[index] = 0;
+ break;
+ case 64:
+ env->vfp.vreg[vreg].u64[index] = 0;
+ break;
+ default:
+ helper_raise_exception(env, RISCV_EXCP_ILLEGAL_INST);
+ return;
+ }
+}
+
static void vector_tail_segment(CPURISCVState *env, int vreg, int index,
int width, int nf, int lmul)
{
@@ -3329,3 +3344,1455 @@ void VECTOR_HELPER(vleff_v)(CPURISCVState *env, uint32_t nf, uint32_t vm,
env->vfp.vl = vl;
env->vfp.vstart = 0;
}
+
+void VECTOR_HELPER(vamoswapw_v)(CPURISCVState *env, uint32_t wd, uint32_t vm,
+ uint32_t rs1, uint32_t vs2, uint32_t vs3)
+{
+ int i, j, vl;
+ target_long idx;
+ uint32_t lmul, width, src2, src3, vlmax;
+ target_ulong addr;
+#ifdef CONFIG_SOFTMMU
+ int mem_idx = cpu_mmu_index(env, false);
+ TCGMemOp memop = MO_ALIGN | MO_TESL;
+#endif
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+ /* MEM <= SEW <= XLEN */
+ if (width < 32 || (width > sizeof(target_ulong) * 8)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ /* if wd, rd is written with the old value */
+ if (vector_vtype_ill(env) ||
+ (vector_overlap_vm_common(lmul, vm, vs3) && wd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, vs2, false);
+ vector_lmul_check_reg(env, lmul, vs3, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = vs2 + (i / (VLEN / width));
+ src3 = vs3 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ int32_t tmp;
+ idx = (target_long)env->vfp.vreg[src2].s32[j];
+ addr = idx + env->gpr[rs1];
+#ifdef CONFIG_SOFTMMU
+ tmp = helper_atomic_xchgl_le(env, addr,
+ env->vfp.vreg[src3].s32[j],
+ make_memop_idx(memop & ~MO_SIGN, mem_idx));
+#else
+ tmp = helper_atomic_xchgl_le(env, addr,
+ env->vfp.vreg[src3].s32[j]);
+#endif
+ if (wd) {
+ env->vfp.vreg[src3].s32[j] = tmp;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ int64_t tmp;
+ idx = (target_long)env->vfp.vreg[src2].s64[j];
+ addr = idx + env->gpr[rs1];
+
+#ifdef CONFIG_SOFTMMU
+ tmp = (int64_t)(int32_t)helper_atomic_xchgl_le(env, addr,
+ env->vfp.vreg[src3].s64[j],
+ make_memop_idx(memop & ~MO_SIGN, mem_idx));
+#else
+ tmp = (int64_t)(int32_t)helper_atomic_xchgl_le(env, addr,
+ env->vfp.vreg[src3].s64[j]);
+#endif
+ if (wd) {
+ env->vfp.vreg[src3].s64[j] = tmp;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_amo(env, src3, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vamoswapd_v)(CPURISCVState *env, uint32_t wd, uint32_t vm,
+ uint32_t rs1, uint32_t vs2, uint32_t vs3)
+{
+ int i, j, vl;
+ target_long idx;
+ uint32_t lmul, width, src2, src3, vlmax;
+ target_ulong addr;
+#ifdef CONFIG_SOFTMMU
+ int mem_idx = cpu_mmu_index(env, false);
+ TCGMemOp memop = MO_ALIGN | MO_TEQ;
+#endif
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+ /* MEM <= SEW <= XLEN */
+ if (width < 64 || (width > sizeof(target_ulong) * 8)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ /* if wd, rd is written with the old value */
+ if (vector_vtype_ill(env) ||
+ (vector_overlap_vm_common(lmul, vm, vs3) && wd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, vs2, false);
+ vector_lmul_check_reg(env, lmul, vs3, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = vs2 + (i / (VLEN / width));
+ src3 = vs3 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ int64_t tmp;
+ idx = (target_long)env->vfp.vreg[src2].s64[j];
+ addr = idx + env->gpr[rs1];
+
+#ifdef CONFIG_SOFTMMU
+ tmp = helper_atomic_xchgq_le(env, addr,
+ env->vfp.vreg[src3].s64[j],
+ make_memop_idx(memop & ~MO_SIGN, mem_idx));
+#else
+ tmp = helper_atomic_xchgq_le(env, addr,
+ env->vfp.vreg[src3].s64[j]);
+#endif
+
+ if (wd) {
+ env->vfp.vreg[src3].s64[j] = tmp;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_amo(env, src3, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vamoaddw_v)(CPURISCVState *env, uint32_t wd, uint32_t vm,
+ uint32_t rs1, uint32_t vs2, uint32_t vs3)
+{
+ int i, j, vl;
+ target_long idx;
+ uint32_t lmul, width, src2, src3, vlmax;
+ target_ulong addr;
+#ifdef CONFIG_SOFTMMU
+ int mem_idx = cpu_mmu_index(env, false);
+ TCGMemOp memop = MO_ALIGN | MO_TESL;
+#endif
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+ /* MEM <= SEW <= XLEN */
+ if (width < 32 || (width > sizeof(target_ulong) * 8)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ /* if wd, rd is written with the old value */
+ if (vector_vtype_ill(env) ||
+ (vector_overlap_vm_common(lmul, vm, vs3) && wd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, vs2, false);
+ vector_lmul_check_reg(env, lmul, vs3, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = vs2 + (i / (VLEN / width));
+ src3 = vs3 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ int32_t tmp;
+ idx = (target_long)env->vfp.vreg[src2].s32[j];
+ addr = idx + env->gpr[rs1];
+#ifdef CONFIG_SOFTMMU
+ tmp = helper_atomic_fetch_addl_le(env, addr,
+ env->vfp.vreg[src3].s32[j],
+ make_memop_idx(memop & ~MO_SIGN, mem_idx));
+#else
+ tmp = helper_atomic_fetch_addl_le(env, addr,
+ env->vfp.vreg[src3].s32[j]);
+#endif
+ if (wd) {
+ env->vfp.vreg[src3].s32[j] = tmp;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ int64_t tmp;
+ idx = (target_long)env->vfp.vreg[src2].s64[j];
+ addr = idx + env->gpr[rs1];
+
+#ifdef CONFIG_SOFTMMU
+ tmp = (int64_t)(int32_t)helper_atomic_fetch_addl_le(env,
+ addr, env->vfp.vreg[src3].s64[j],
+ make_memop_idx(memop & ~MO_SIGN, mem_idx));
+#else
+ tmp = (int64_t)(int32_t)helper_atomic_fetch_addl_le(env,
+ addr, env->vfp.vreg[src3].s64[j]);
+#endif
+ if (wd) {
+ env->vfp.vreg[src3].s64[j] = tmp;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_amo(env, src3, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vamoaddd_v)(CPURISCVState *env, uint32_t wd, uint32_t vm,
+ uint32_t rs1, uint32_t vs2, uint32_t vs3)
+{
+ int i, j, vl;
+ target_long idx;
+ uint32_t lmul, width, src2, src3, vlmax;
+ target_ulong addr;
+#ifdef CONFIG_SOFTMMU
+ int mem_idx = cpu_mmu_index(env, false);
+ TCGMemOp memop = MO_ALIGN | MO_TEQ;
+#endif
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+ /* MEM <= SEW <= XLEN */
+ if (width < 64 || (width > sizeof(target_ulong) * 8)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ /* if wd, rd is written with the old value */
+ if (vector_vtype_ill(env) ||
+ (vector_overlap_vm_common(lmul, vm, vs3) && wd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, vs2, false);
+ vector_lmul_check_reg(env, lmul, vs3, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = vs2 + (i / (VLEN / width));
+ src3 = vs3 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ int64_t tmp;
+ idx = (target_long)env->vfp.vreg[src2].s64[j];
+ addr = idx + env->gpr[rs1];
+
+#ifdef CONFIG_SOFTMMU
+ tmp = helper_atomic_fetch_addq_le(env, addr,
+ env->vfp.vreg[src3].s64[j],
+ make_memop_idx(memop & ~MO_SIGN, mem_idx));
+#else
+ tmp = helper_atomic_fetch_addq_le(env, addr,
+ env->vfp.vreg[src3].s64[j]);
+#endif
+ if (wd) {
+ env->vfp.vreg[src3].s64[j] = tmp;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_amo(env, src3, j, width);
+ }
+ }
+
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vamoxorw_v)(CPURISCVState *env, uint32_t wd, uint32_t vm,
+ uint32_t rs1, uint32_t vs2, uint32_t vs3)
+{
+ int i, j, vl;
+ target_long idx;
+ uint32_t lmul, width, src2, src3, vlmax;
+ target_ulong addr;
+#ifdef CONFIG_SOFTMMU
+ int mem_idx = cpu_mmu_index(env, false);
+ TCGMemOp memop = MO_ALIGN | MO_TESL;
+#endif
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+ /* MEM <= SEW <= XLEN */
+ if (width < 32 || (width > sizeof(target_ulong) * 8)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ /* if wd, rd is written with the old value */
+ if (vector_vtype_ill(env) ||
+ (vector_overlap_vm_common(lmul, vm, vs3) && wd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, vs2, false);
+ vector_lmul_check_reg(env, lmul, vs3, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = vs2 + (i / (VLEN / width));
+ src3 = vs3 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ int32_t tmp;
+ idx = (target_long)env->vfp.vreg[src2].s32[j];
+ addr = idx + env->gpr[rs1];
+#ifdef CONFIG_SOFTMMU
+ tmp = helper_atomic_fetch_xorl_le(env, addr,
+ env->vfp.vreg[src3].s32[j],
+ make_memop_idx(memop & ~MO_SIGN, mem_idx));
+#else
+ tmp = helper_atomic_fetch_xorl_le(env, addr,
+ env->vfp.vreg[src3].s32[j]);
+#endif
+ if (wd) {
+ env->vfp.vreg[src3].s32[j] = tmp;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ int64_t tmp;
+ idx = (target_long)env->vfp.vreg[src2].s64[j];
+ addr = idx + env->gpr[rs1];
+
+#ifdef CONFIG_SOFTMMU
+ tmp = (int64_t)(int32_t)helper_atomic_fetch_xorl_le(env,
+ addr, env->vfp.vreg[src3].s64[j],
+ make_memop_idx(memop & ~MO_SIGN, mem_idx));
+#else
+ tmp = (int64_t)(int32_t)helper_atomic_fetch_xorl_le(env,
+ addr, env->vfp.vreg[src3].s64[j]);
+#endif
+ if (wd) {
+ env->vfp.vreg[src3].s64[j] = tmp;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_amo(env, src3, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vamoxord_v)(CPURISCVState *env, uint32_t wd, uint32_t vm,
+ uint32_t rs1, uint32_t vs2, uint32_t vs3)
+{
+ int i, j, vl;
+ target_long idx;
+ uint32_t lmul, width, src2, src3, vlmax;
+ target_ulong addr;
+#ifdef CONFIG_SOFTMMU
+ int mem_idx = cpu_mmu_index(env, false);
+ TCGMemOp memop = MO_ALIGN | MO_TEQ;
+#endif
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+ /* MEM <= SEW <= XLEN */
+ if (width < 64 || (width > sizeof(target_ulong) * 8)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ /* if wd, rd is written with the old value */
+ if (vector_vtype_ill(env) ||
+ (vector_overlap_vm_common(lmul, vm, vs3) && wd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, vs2, false);
+ vector_lmul_check_reg(env, lmul, vs3, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = vs2 + (i / (VLEN / width));
+ src3 = vs3 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ int64_t tmp;
+ idx = (target_long)env->vfp.vreg[src2].s64[j];
+ addr = idx + env->gpr[rs1];
+
+#ifdef CONFIG_SOFTMMU
+ tmp = helper_atomic_fetch_xorq_le(env, addr,
+ env->vfp.vreg[src3].s64[j],
+ make_memop_idx(memop & ~MO_SIGN, mem_idx));
+#else
+ tmp = helper_atomic_fetch_xorq_le(env, addr,
+ env->vfp.vreg[src3].s64[j]);
+#endif
+ if (wd) {
+ env->vfp.vreg[src3].s64[j] = tmp;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_amo(env, src3, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vamoandw_v)(CPURISCVState *env, uint32_t wd, uint32_t vm,
+ uint32_t rs1, uint32_t vs2, uint32_t vs3)
+{
+ int i, j, vl;
+ target_long idx;
+ uint32_t lmul, width, src2, src3, vlmax;
+ target_ulong addr;
+#ifdef CONFIG_SOFTMMU
+ int mem_idx = cpu_mmu_index(env, false);
+ TCGMemOp memop = MO_ALIGN | MO_TESL;
+#endif
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+ /* MEM <= SEW <= XLEN */
+ if (width < 32 || (width > sizeof(target_ulong) * 8)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ /* if wd, rd is written with the old value */
+ if (vector_vtype_ill(env) ||
+ (vector_overlap_vm_common(lmul, vm, vs3) && wd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, vs2, false);
+ vector_lmul_check_reg(env, lmul, vs3, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = vs2 + (i / (VLEN / width));
+ src3 = vs3 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ int32_t tmp;
+ idx = (target_long)env->vfp.vreg[src2].s32[j];
+ addr = idx + env->gpr[rs1];
+#ifdef CONFIG_SOFTMMU
+ tmp = helper_atomic_fetch_andl_le(env, addr,
+ env->vfp.vreg[src3].s32[j],
+ make_memop_idx(memop & ~MO_SIGN, mem_idx));
+#else
+ tmp = helper_atomic_fetch_andl_le(env, addr,
+ env->vfp.vreg[src3].s32[j]);
+#endif
+ if (wd) {
+ env->vfp.vreg[src3].s32[j] = tmp;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ int64_t tmp;
+ idx = (target_long)env->vfp.vreg[src2].s64[j];
+ addr = idx + env->gpr[rs1];
+
+#ifdef CONFIG_SOFTMMU
+ tmp = (int64_t)(int32_t)helper_atomic_fetch_andl_le(env,
+ addr, env->vfp.vreg[src3].s64[j],
+ make_memop_idx(memop & ~MO_SIGN, mem_idx));
+#else
+ tmp = (int64_t)(int32_t)helper_atomic_fetch_andl_le(env,
+ addr, env->vfp.vreg[src3].s64[j]);
+#endif
+ if (wd) {
+ env->vfp.vreg[src3].s64[j] = tmp;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_amo(env, src3, j, width);
+ }
+ }
+
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vamoandd_v)(CPURISCVState *env, uint32_t wd, uint32_t vm,
+ uint32_t rs1, uint32_t vs2, uint32_t vs3)
+{
+ int i, j, vl;
+ target_long idx;
+ uint32_t lmul, width, src2, src3, vlmax;
+ target_ulong addr;
+#ifdef CONFIG_SOFTMMU
+ int mem_idx = cpu_mmu_index(env, false);
+ TCGMemOp memop = MO_ALIGN | MO_TEQ;
+#endif
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+ /* MEM <= SEW <= XLEN */
+ if (width < 64 || (width > sizeof(target_ulong) * 8)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ /* if wd, rd is written with the old value */
+ if (vector_vtype_ill(env) ||
+ (vector_overlap_vm_common(lmul, vm, vs3) && wd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, vs2, false);
+ vector_lmul_check_reg(env, lmul, vs3, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = vs2 + (i / (VLEN / width));
+ src3 = vs3 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ int64_t tmp;
+ idx = (target_long)env->vfp.vreg[src2].s64[j];
+ addr = idx + env->gpr[rs1];
+
+#ifdef CONFIG_SOFTMMU
+ tmp = helper_atomic_fetch_andq_le(env, addr,
+ env->vfp.vreg[src3].s64[j],
+ make_memop_idx(memop & ~MO_SIGN, mem_idx));
+#else
+ tmp = helper_atomic_fetch_andq_le(env, addr,
+ env->vfp.vreg[src3].s64[j]);
+#endif
+ if (wd) {
+ env->vfp.vreg[src3].s64[j] = tmp;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_amo(env, src3, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vamoorw_v)(CPURISCVState *env, uint32_t wd, uint32_t vm,
+ uint32_t rs1, uint32_t vs2, uint32_t vs3)
+{
+ int i, j, vl;
+ target_long idx;
+ uint32_t lmul, width, src2, src3, vlmax;
+ target_ulong addr;
+#ifdef CONFIG_SOFTMMU
+ int mem_idx = cpu_mmu_index(env, false);
+ TCGMemOp memop = MO_ALIGN | MO_TESL;
+#endif
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+ /* MEM <= SEW <= XLEN */
+ if (width < 32 || (width > sizeof(target_ulong) * 8)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ /* if wd, rd is written with the old value */
+ if (vector_vtype_ill(env) ||
+ (vector_overlap_vm_common(lmul, vm, vs3) && wd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, vs2, false);
+ vector_lmul_check_reg(env, lmul, vs3, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = vs2 + (i / (VLEN / width));
+ src3 = vs3 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ int32_t tmp;
+ idx = (target_long)env->vfp.vreg[src2].s32[j];
+ addr = idx + env->gpr[rs1];
+#ifdef CONFIG_SOFTMMU
+ tmp = helper_atomic_fetch_orl_le(env, addr,
+ env->vfp.vreg[src3].s32[j],
+ make_memop_idx(memop & ~MO_SIGN, mem_idx));
+#else
+ tmp = helper_atomic_fetch_orl_le(env, addr,
+ env->vfp.vreg[src3].s32[j]);
+#endif
+ if (wd) {
+ env->vfp.vreg[src3].s32[j] = tmp;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ int64_t tmp;
+ idx = (target_long)env->vfp.vreg[src2].s64[j];
+ addr = idx + env->gpr[rs1];
+
+#ifdef CONFIG_SOFTMMU
+ tmp = (int64_t)(int32_t)helper_atomic_fetch_orl_le(env,
+ addr, env->vfp.vreg[src3].s64[j],
+ make_memop_idx(memop & ~MO_SIGN, mem_idx));
+#else
+ tmp = (int64_t)(int32_t)helper_atomic_fetch_orl_le(env,
+ addr, env->vfp.vreg[src3].s64[j]);
+#endif
+ if (wd) {
+ env->vfp.vreg[src3].s64[j] = tmp;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_amo(env, src3, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vamoord_v)(CPURISCVState *env, uint32_t wd, uint32_t vm,
+ uint32_t rs1, uint32_t vs2, uint32_t vs3)
+{
+ int i, j, vl;
+ target_long idx;
+ uint32_t lmul, width, src2, src3, vlmax;
+ target_ulong addr;
+#ifdef CONFIG_SOFTMMU
+ int mem_idx = cpu_mmu_index(env, false);
+ TCGMemOp memop = MO_ALIGN | MO_TEQ;
+#endif
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+ /* MEM <= SEW <= XLEN */
+ if (width < 64 || (width > sizeof(target_ulong) * 8)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ /* if wd, rd is written with the old value */
+ if (vector_vtype_ill(env) ||
+ (vector_overlap_vm_common(lmul, vm, vs3) && wd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, vs2, false);
+ vector_lmul_check_reg(env, lmul, vs3, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = vs2 + (i / (VLEN / width));
+ src3 = vs3 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ int64_t tmp;
+ idx = (target_long)env->vfp.vreg[src2].s64[j];
+ addr = idx + env->gpr[rs1];
+
+#ifdef CONFIG_SOFTMMU
+ tmp = helper_atomic_fetch_orq_le(env, addr,
+ env->vfp.vreg[src3].s64[j],
+ make_memop_idx(memop & ~MO_SIGN, mem_idx));
+#else
+ tmp = helper_atomic_fetch_orq_le(env, addr,
+ env->vfp.vreg[src3].s64[j]);
+#endif
+ if (wd) {
+ env->vfp.vreg[src3].s64[j] = tmp;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_amo(env, src3, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vamominw_v)(CPURISCVState *env, uint32_t wd, uint32_t vm,
+ uint32_t rs1, uint32_t vs2, uint32_t vs3)
+{
+ int i, j, vl;
+ target_long idx;
+ uint32_t lmul, width, src2, src3, vlmax;
+ target_ulong addr;
+#ifdef CONFIG_SOFTMMU
+ int mem_idx = cpu_mmu_index(env, false);
+ TCGMemOp memop = MO_ALIGN | MO_TESL;
+#endif
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+ /* MEM <= SEW <= XLEN */
+ if (width < 32 || (width > sizeof(target_ulong) * 8)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ /* if wd, rd is written with the old value */
+ if (vector_vtype_ill(env) ||
+ (vector_overlap_vm_common(lmul, vm, vs3) && wd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, vs2, false);
+ vector_lmul_check_reg(env, lmul, vs3, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = vs2 + (i / (VLEN / width));
+ src3 = vs3 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ int32_t tmp;
+ idx = (target_long)env->vfp.vreg[src2].s32[j];
+ addr = idx + env->gpr[rs1];
+#ifdef CONFIG_SOFTMMU
+ tmp = helper_atomic_fetch_sminl_le(env, addr,
+ env->vfp.vreg[src3].s32[j],
+ make_memop_idx(memop & ~MO_SIGN, mem_idx));
+#else
+ tmp = helper_atomic_fetch_sminl_le(env, addr,
+ env->vfp.vreg[src3].s32[j]);
+#endif
+ if (wd) {
+ env->vfp.vreg[src3].s32[j] = tmp;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ int64_t tmp;
+ idx = (target_long)env->vfp.vreg[src2].s64[j];
+ addr = idx + env->gpr[rs1];
+
+#ifdef CONFIG_SOFTMMU
+ tmp = (int64_t)(int32_t)helper_atomic_fetch_sminl_le(env,
+ addr, env->vfp.vreg[src3].s64[j],
+ make_memop_idx(memop & ~MO_SIGN, mem_idx));
+#else
+ tmp = (int64_t)(int32_t)helper_atomic_fetch_sminl_le(env,
+ addr, env->vfp.vreg[src3].s64[j]);
+#endif
+ if (wd) {
+ env->vfp.vreg[src3].s64[j] = tmp;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_amo(env, src3, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vamomind_v)(CPURISCVState *env, uint32_t wd, uint32_t vm,
+ uint32_t rs1, uint32_t vs2, uint32_t vs3)
+{
+ int i, j, vl;
+ target_long idx;
+ uint32_t lmul, width, src2, src3, vlmax;
+ target_ulong addr;
+#ifdef CONFIG_SOFTMMU
+ int mem_idx = cpu_mmu_index(env, false);
+ TCGMemOp memop = MO_ALIGN | MO_TEQ;
+#endif
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+ /* MEM <= SEW <= XLEN */
+ if (width < 64 || (width > sizeof(target_ulong) * 8)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ /* if wd is set, the old memory value is written to rd */
+ if (vector_vtype_ill(env) ||
+ (vector_overlap_vm_common(lmul, vm, vs3) && wd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, vs2, false);
+ vector_lmul_check_reg(env, lmul, vs3, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = vs2 + (i / (VLEN / width));
+ src3 = vs3 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ int64_t tmp;
+ idx = (target_long)env->vfp.vreg[src2].s64[j];
+ addr = idx + env->gpr[rs1];
+
+#ifdef CONFIG_SOFTMMU
+ tmp = helper_atomic_fetch_sminq_le(env, addr,
+ env->vfp.vreg[src3].s64[j],
+ make_memop_idx(memop & ~MO_SIGN, mem_idx));
+#else
+ tmp = helper_atomic_fetch_sminq_le(env, addr,
+ env->vfp.vreg[src3].s64[j]);
+#endif
+ if (wd) {
+ env->vfp.vreg[src3].s64[j] = tmp;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_amo(env, src3, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vamomaxw_v)(CPURISCVState *env, uint32_t wd, uint32_t vm,
+ uint32_t rs1, uint32_t vs2, uint32_t vs3)
+{
+ int i, j, vl;
+ target_long idx;
+ uint32_t lmul, width, src2, src3, vlmax;
+ target_ulong addr;
+#ifdef CONFIG_SOFTMMU
+ int mem_idx = cpu_mmu_index(env, false);
+ TCGMemOp memop = MO_ALIGN | MO_TESL;
+#endif
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+ /* MEM <= SEW <= XLEN */
+ if (width < 32 || (width > sizeof(target_ulong) * 8)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ /* if wd is set, the old memory value is written to rd */
+ if (vector_vtype_ill(env) ||
+ (vector_overlap_vm_common(lmul, vm, vs3) && wd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, vs2, false);
+ vector_lmul_check_reg(env, lmul, vs3, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = vs2 + (i / (VLEN / width));
+ src3 = vs3 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ int32_t tmp;
+ idx = (target_long)env->vfp.vreg[src2].s32[j];
+ addr = idx + env->gpr[rs1];
+#ifdef CONFIG_SOFTMMU
+ tmp = helper_atomic_fetch_smaxl_le(env, addr,
+ env->vfp.vreg[src3].s32[j],
+ make_memop_idx(memop & ~MO_SIGN, mem_idx));
+#else
+ tmp = helper_atomic_fetch_smaxl_le(env, addr,
+ env->vfp.vreg[src3].s32[j]);
+#endif
+ if (wd) {
+ env->vfp.vreg[src3].s32[j] = tmp;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ int64_t tmp;
+ idx = (target_long)env->vfp.vreg[src2].s64[j];
+ addr = idx + env->gpr[rs1];
+
+#ifdef CONFIG_SOFTMMU
+ tmp = (int64_t)(int32_t)helper_atomic_fetch_smaxl_le(env,
+ addr, env->vfp.vreg[src3].s64[j],
+ make_memop_idx(memop & ~MO_SIGN, mem_idx));
+#else
+ tmp = (int64_t)(int32_t)helper_atomic_fetch_smaxl_le(env,
+ addr, env->vfp.vreg[src3].s64[j]);
+#endif
+ if (wd) {
+ env->vfp.vreg[src3].s64[j] = tmp;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_amo(env, src3, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vamomaxd_v)(CPURISCVState *env, uint32_t wd, uint32_t vm,
+ uint32_t rs1, uint32_t vs2, uint32_t vs3)
+{
+ int i, j, vl;
+ target_long idx;
+ uint32_t lmul, width, src2, src3, vlmax;
+ target_ulong addr;
+#ifdef CONFIG_SOFTMMU
+ int mem_idx = cpu_mmu_index(env, false);
+ TCGMemOp memop = MO_ALIGN | MO_TEQ;
+#endif
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+ /* MEM <= SEW <= XLEN */
+ if (width < 64 || (width > sizeof(target_ulong) * 8)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ /* if wd is set, the old memory value is written to rd */
+ if (vector_vtype_ill(env) ||
+ (vector_overlap_vm_common(lmul, vm, vs3) && wd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, vs2, false);
+ vector_lmul_check_reg(env, lmul, vs3, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = vs2 + (i / (VLEN / width));
+ src3 = vs3 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ int64_t tmp;
+ idx = (target_long)env->vfp.vreg[src2].s64[j];
+ addr = idx + env->gpr[rs1];
+
+#ifdef CONFIG_SOFTMMU
+ tmp = helper_atomic_fetch_smaxq_le(env, addr,
+ env->vfp.vreg[src3].s64[j],
+ make_memop_idx(memop & ~MO_SIGN, mem_idx));
+#else
+ tmp = helper_atomic_fetch_smaxq_le(env, addr,
+ env->vfp.vreg[src3].s64[j]);
+#endif
+ if (wd) {
+ env->vfp.vreg[src3].s64[j] = tmp;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_amo(env, src3, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vamominuw_v)(CPURISCVState *env, uint32_t wd, uint32_t vm,
+ uint32_t rs1, uint32_t vs2, uint32_t vs3)
+{
+ int i, j, vl;
+ target_long idx;
+ uint32_t lmul, width, src2, src3, vlmax;
+ target_ulong addr;
+#ifdef CONFIG_SOFTMMU
+ int mem_idx = cpu_mmu_index(env, false);
+ TCGMemOp memop = MO_ALIGN | MO_TESL;
+#endif
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+ /* MEM <= SEW <= XLEN */
+ if (width < 32 || (width > sizeof(target_ulong) * 8)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ /* if wd is set, the old memory value is written to rd */
+ if (vector_vtype_ill(env) ||
+ (vector_overlap_vm_common(lmul, vm, vs3) && wd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, vs2, false);
+ vector_lmul_check_reg(env, lmul, vs3, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = vs2 + (i / (VLEN / width));
+ src3 = vs3 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ uint32_t tmp;
+ idx = (target_long)env->vfp.vreg[src2].s32[j];
+ addr = idx + env->gpr[rs1];
+#ifdef CONFIG_SOFTMMU
+ tmp = helper_atomic_fetch_uminl_le(env, addr,
+ env->vfp.vreg[src3].s32[j],
+ make_memop_idx(memop & ~MO_SIGN, mem_idx));
+#else
+ tmp = helper_atomic_fetch_uminl_le(env, addr,
+ env->vfp.vreg[src3].s32[j]);
+#endif
+ if (wd) {
+ env->vfp.vreg[src3].s32[j] = tmp;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ uint64_t tmp;
+ idx = (target_long)env->vfp.vreg[src2].s64[j];
+ addr = idx + env->gpr[rs1];
+
+#ifdef CONFIG_SOFTMMU
+ tmp = (int64_t)(int32_t)helper_atomic_fetch_uminl_le(
+ env, addr, env->vfp.vreg[src3].s64[j],
+ make_memop_idx(memop & ~MO_SIGN, mem_idx));
+#else
+ tmp = (int64_t)(int32_t)helper_atomic_fetch_uminl_le(
+ env, addr, env->vfp.vreg[src3].s64[j]);
+#endif
+ if (wd) {
+ env->vfp.vreg[src3].s64[j] = tmp;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_amo(env, src3, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vamominud_v)(CPURISCVState *env, uint32_t wd, uint32_t vm,
+ uint32_t rs1, uint32_t vs2, uint32_t vs3)
+{
+ int i, j, vl;
+ target_long idx;
+ uint32_t lmul, width, src2, src3, vlmax;
+ target_ulong addr;
+#ifdef CONFIG_SOFTMMU
+ int mem_idx = cpu_mmu_index(env, false);
+ TCGMemOp memop = MO_ALIGN | MO_TEQ;
+#endif
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+ /* MEM <= SEW <= XLEN */
+ if (width < 64 || (width > sizeof(target_ulong) * 8)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ /* if wd is set, the old memory value is written to rd */
+ if (vector_vtype_ill(env) ||
+ (vector_overlap_vm_common(lmul, vm, vs3) && wd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, vs2, false);
+ vector_lmul_check_reg(env, lmul, vs3, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = vs2 + (i / (VLEN / width));
+ src3 = vs3 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ uint32_t tmp;
+ idx = (target_long)env->vfp.vreg[src2].s32[j];
+ addr = idx + env->gpr[rs1];
+#ifdef CONFIG_SOFTMMU
+ tmp = helper_atomic_fetch_uminl_le(env, addr,
+ env->vfp.vreg[src3].s32[j],
+ make_memop_idx(memop & ~MO_SIGN, mem_idx));
+#else
+ tmp = helper_atomic_fetch_uminl_le(env, addr,
+ env->vfp.vreg[src3].s32[j]);
+#endif
+ if (wd) {
+ env->vfp.vreg[src3].s32[j] = tmp;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ uint64_t tmp;
+ idx = (target_long)env->vfp.vreg[src2].s64[j];
+ addr = idx + env->gpr[rs1];
+
+#ifdef CONFIG_SOFTMMU
+ tmp = helper_atomic_fetch_uminq_le(
+ env, addr, env->vfp.vreg[src3].s64[j],
+ make_memop_idx(memop & ~MO_SIGN, mem_idx));
+#else
+ tmp = helper_atomic_fetch_uminq_le(env, addr,
+ env->vfp.vreg[src3].s64[j]);
+#endif
+ if (wd) {
+ env->vfp.vreg[src3].s64[j] = tmp;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_amo(env, src3, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vamomaxuw_v)(CPURISCVState *env, uint32_t wd, uint32_t vm,
+ uint32_t rs1, uint32_t vs2, uint32_t vs3)
+{
+ int i, j, vl;
+ target_long idx;
+ uint32_t lmul, width, src2, src3, vlmax;
+ target_ulong addr;
+#ifdef CONFIG_SOFTMMU
+ int mem_idx = cpu_mmu_index(env, false);
+ TCGMemOp memop = MO_ALIGN | MO_TESL;
+#endif
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+ /* MEM <= SEW <= XLEN */
+ if (width < 32 || (width > sizeof(target_ulong) * 8)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ /* if wd is set, the old memory value is written to rd */
+ if (vector_vtype_ill(env) ||
+ (vector_overlap_vm_common(lmul, vm, vs3) && wd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, vs2, false);
+ vector_lmul_check_reg(env, lmul, vs3, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = vs2 + (i / (VLEN / width));
+ src3 = vs3 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ uint32_t tmp;
+ idx = (target_long)env->vfp.vreg[src2].s32[j];
+ addr = idx + env->gpr[rs1];
+#ifdef CONFIG_SOFTMMU
+ tmp = helper_atomic_fetch_umaxl_le(env, addr,
+ env->vfp.vreg[src3].s32[j],
+ make_memop_idx(memop & ~MO_SIGN, mem_idx));
+#else
+ tmp = helper_atomic_fetch_umaxl_le(env, addr,
+ env->vfp.vreg[src3].s32[j]);
+#endif
+ if (wd) {
+ env->vfp.vreg[src3].s32[j] = tmp;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ uint64_t tmp;
+ idx = (target_long)env->vfp.vreg[src2].s64[j];
+ addr = idx + env->gpr[rs1];
+
+#ifdef CONFIG_SOFTMMU
+ tmp = (int64_t)(int32_t)helper_atomic_fetch_umaxl_le(
+ env, addr, env->vfp.vreg[src3].s64[j],
+ make_memop_idx(memop & ~MO_SIGN, mem_idx));
+#else
+ tmp = (int64_t)(int32_t)helper_atomic_fetch_umaxl_le(
+ env, addr, env->vfp.vreg[src3].s64[j]);
+#endif
+ if (wd) {
+ env->vfp.vreg[src3].s64[j] = tmp;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_amo(env, src3, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vamomaxud_v)(CPURISCVState *env, uint32_t wd, uint32_t vm,
+ uint32_t rs1, uint32_t vs2, uint32_t vs3)
+{
+ int i, j, vl;
+ target_long idx;
+ uint32_t lmul, width, src2, src3, vlmax;
+ target_ulong addr;
+#ifdef CONFIG_SOFTMMU
+ int mem_idx = cpu_mmu_index(env, false);
+ TCGMemOp memop = MO_ALIGN | MO_TEQ;
+#endif
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+ /* MEM <= SEW <= XLEN */
+ if (width < 64 || (width > sizeof(target_ulong) * 8)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ /* if wd is set, the old memory value is written to rd */
+ if (vector_vtype_ill(env) ||
+ (vector_overlap_vm_common(lmul, vm, vs3) && wd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, vs2, false);
+ vector_lmul_check_reg(env, lmul, vs3, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = vs2 + (i / (VLEN / width));
+ src3 = vs3 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ uint64_t tmp;
+ idx = (target_long)env->vfp.vreg[src2].s64[j];
+ addr = idx + env->gpr[rs1];
+
+#ifdef CONFIG_SOFTMMU
+ tmp = helper_atomic_fetch_umaxq_le(
+ env, addr, env->vfp.vreg[src3].s64[j],
+ make_memop_idx(memop & ~MO_SIGN, mem_idx));
+#else
+ tmp = helper_atomic_fetch_umaxq_le(env, addr,
+ env->vfp.vreg[src3].s64[j]);
+#endif
+ if (wd) {
+ env->vfp.vreg[src3].s64[j] = tmp;
+ }
+ env->vfp.vstart++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_amo(env, src3, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
--
2.7.4
^ permalink raw reply related [flat|nested] 43+ messages in thread
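Each helper loop in the patch above locates element i with
"vs + (i / (VLEN / width))" for the register and "i % (VLEN / width)" for the
lane. The following is a minimal standalone sketch of that arithmetic, not code
from the patch: elem_pos and elem_locate are hypothetical names introduced here
for illustration, and VLEN is assumed to be 128 as defined in cpu.h by patch 01.

```c
#include <assert.h>
#include <stdint.h>

#define VLEN 128 /* bits per vector register, as in the series' cpu.h */

/* Hypothetical result type: which register and which lane hold element i. */
struct elem_pos {
    uint32_t reg;
    uint32_t lane;
};

/* Map element index i of a register group starting at 'base' to (reg, lane). */
struct elem_pos elem_locate(uint32_t base, uint32_t sew, uint32_t i)
{
    /* Only the element widths handled by the helpers are accepted. */
    assert(sew == 8 || sew == 16 || sew == 32 || sew == 64);

    uint32_t per_reg = VLEN / sew;  /* elements that fit in one register */
    struct elem_pos p = { base + i / per_reg, i % per_reg };
    return p;
}
```

With SEW=32 a register holds four elements, so element 5 of a group starting at
v4 lands in v5, lane 1.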
* Re: [Qemu-devel] [PATCH v2 07/17] RISC-V: add vector extension atomic instructions
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 07/17] RISC-V: add vector extension atomic instructions liuzhiwei
@ 2019-09-12 14:57 ` Richard Henderson
0 siblings, 0 replies; 43+ messages in thread
From: Richard Henderson @ 2019-09-12 14:57 UTC (permalink / raw)
To: liuzhiwei, Alistair.Francis, palmer, sagark, kbastian,
riku.voipio, laurent, wenmeng_zhang
Cc: qemu-riscv, qemu-devel, wxy194768
On 9/11/19 2:25 AM, liuzhiwei wrote:
> + case 64:
> + if (vector_elem_mask(env, vm, width, lmul, i)) {
> + int64_t tmp;
> + idx = (target_long)env->vfp.vreg[src2].s64[j];
> + addr = idx + env->gpr[rs1];
> +
> +#ifdef CONFIG_SOFTMMU
> + tmp = (int64_t)(int32_t)helper_atomic_xchgl_le(env, addr,
> + env->vfp.vreg[src3].s64[j],
> + make_memop_idx(memop & ~MO_SIGN, mem_idx));
> +#else
> + tmp = (int64_t)(int32_t)helper_atomic_xchgl_le(env, addr,
> + env->vfp.vreg[src3].s64[j]);
> +#endif
> + if (wd) {
> + env->vfp.vreg[src3].s64[j] = tmp;
> + }
> + env->vfp.vstart++;
> + }
> + break;
This will not link if !defined(CONFIG_ATOMIC64).

That's pretty rare these days, admittedly. I think you'd need to compile for
a ppc32 or mips32 (or riscv32!) host to see this. You can force this condition
for an i686 host with --extra-cflags='-march=i486', just to see if you've got it
right.

There should be two different versions of this helper: one that performs actual
atomic operations, as above, and a second that performs the same operation with
non-atomic operations.

The version of the helper that you call should be based on the translation time
setting of "tb_cflags(s->base.tb) & CF_PARALLEL": if PARALLEL is set, call the
atomic helper; otherwise, call the non-atomic helper.

If you arrive at a situation in which the host cannot handle any atomic
operation, then you must raise the EXCP_ATOMIC exception. This will halt all
other cpus and run one instruction on this cpu while holding the exclusive lock.

If you cannot detect this condition any earlier than here at runtime, use
cpu_loop_exit_atomic(), but you must do so before altering any cpu state.
However, as per my comments for normal loads, you should be able to detect this
condition at translation time and call gen_helper_exit_atomic().
r~
^ permalink raw reply [flat|nested] 43+ messages in thread
* [Qemu-devel] [PATCH v2 08/17] RISC-V: add vector extension integer instructions part1, add/sub/adc/sbc
2019-09-11 6:25 [Qemu-devel] [PATCH v2 00/17] RISC-V: support vector extension liuzhiwei
` (6 preceding siblings ...)
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 07/17] RISC-V: add vector extension atomic instructions liuzhiwei
@ 2019-09-11 6:25 ` liuzhiwei
2019-09-12 15:27 ` Richard Henderson
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 09/17] RISC-V: add vector extension integer instructions part2, bit/shift liuzhiwei
` (9 subsequent siblings)
17 siblings, 1 reply; 43+ messages in thread
From: liuzhiwei @ 2019-09-11 6:25 UTC (permalink / raw)
To: Alistair.Francis, palmer, sagark, kbastian, riku.voipio, laurent,
wenmeng_zhang
Cc: qemu-riscv, qemu-devel, wxy194768, LIU Zhiwei
From: LIU Zhiwei <zhiwei_liu@c-sky.com>
Signed-off-by: LIU Zhiwei <zhiwei_liu@c-sky.com>
---
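Reader's note on the 64x64 high-half helpers this patch adds to vector_helper.c
(u64xu64_lh, s64xu64_lh, s64xs64_lh): a small reference implementation may help
when checking them. The sketch below is not the patch's code. It keeps the same
32-bit split for the unsigned case, but derives the signed variants with a
correction identity instead of the negate-and-fix-up approach used in the patch.
The names mulhu64, mulhsu64 and mulh64 are hypothetical.

```c
#include <stdint.h>

/* High 64 bits of the unsigned 128-bit product a * b. */
uint64_t mulhu64(uint64_t a, uint64_t b)
{
    uint64_t a_hi = a >> 32, a_lo = (uint32_t)a;
    uint64_t b_hi = b >> 32, b_lo = (uint32_t)b;
    uint64_t mid1 = a_hi * b_lo;
    uint64_t mid2 = a_lo * b_hi;
    /* Carry out of bit 63: low halves of the cross terms plus lo*lo >> 32. */
    uint64_t carry = ((uint64_t)(uint32_t)mid1 + (uint64_t)(uint32_t)mid2 +
                      ((a_lo * b_lo) >> 32)) >> 32;

    return a_hi * b_hi + (mid1 >> 32) + (mid2 >> 32) + carry;
}

/*
 * Signed a times unsigned b. Reading a negative a as unsigned adds 2^64 to
 * it, which adds b to the high half, so subtract b again. The sign of the
 * result depends only on a.
 */
int64_t mulhsu64(int64_t a, uint64_t b)
{
    uint64_t hi = mulhu64((uint64_t)a, b);

    if (a < 0) {
        hi -= b;
    }
    /* Conversion back to int64_t assumes two's complement (GCC/Clang). */
    return (int64_t)hi;
}

/* Signed times signed: apply the same correction once per negative operand. */
int64_t mulh64(int64_t a, int64_t b)
{
    uint64_t hi = mulhu64((uint64_t)a, (uint64_t)b);

    if (a < 0) {
        hi -= (uint64_t)b;
    }
    if (b < 0) {
        hi -= (uint64_t)a;
    }
    return (int64_t)hi;
}
```

For example, mulhu64(UINT64_MAX, UINT64_MAX) is 0xFFFFFFFFFFFFFFFE, and
mulhsu64(-1, UINT64_MAX) is -1 because the full product is -(2^64 - 1).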
target/riscv/helper.h | 36 +
target/riscv/insn32.decode | 35 +
target/riscv/insn_trans/trans_rvv.inc.c | 49 +
target/riscv/vector_helper.c | 2335 +++++++++++++++++++++++++++++++
4 files changed, 2455 insertions(+)
diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index c107925..31e20dc 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -121,6 +121,7 @@ DEF_HELPER_6(vector_vsuxb_v, void, env, i32, i32, i32, i32, i32)
DEF_HELPER_6(vector_vsuxh_v, void, env, i32, i32, i32, i32, i32)
DEF_HELPER_6(vector_vsuxw_v, void, env, i32, i32, i32, i32, i32)
DEF_HELPER_6(vector_vsuxe_v, void, env, i32, i32, i32, i32, i32)
+
DEF_HELPER_6(vector_vamoswapw_v, void, env, i32, i32, i32, i32, i32)
DEF_HELPER_6(vector_vamoswapd_v, void, env, i32, i32, i32, i32, i32)
DEF_HELPER_6(vector_vamoaddw_v, void, env, i32, i32, i32, i32, i32)
@@ -139,5 +140,40 @@ DEF_HELPER_6(vector_vamominuw_v, void, env, i32, i32, i32, i32, i32)
DEF_HELPER_6(vector_vamominud_v, void, env, i32, i32, i32, i32, i32)
DEF_HELPER_6(vector_vamomaxuw_v, void, env, i32, i32, i32, i32, i32)
DEF_HELPER_6(vector_vamomaxud_v, void, env, i32, i32, i32, i32, i32)
+
+DEF_HELPER_4(vector_vadc_vvm, void, env, i32, i32, i32)
+DEF_HELPER_4(vector_vadc_vxm, void, env, i32, i32, i32)
+DEF_HELPER_4(vector_vadc_vim, void, env, i32, i32, i32)
+DEF_HELPER_4(vector_vmadc_vvm, void, env, i32, i32, i32)
+DEF_HELPER_4(vector_vmadc_vxm, void, env, i32, i32, i32)
+DEF_HELPER_4(vector_vmadc_vim, void, env, i32, i32, i32)
+DEF_HELPER_4(vector_vsbc_vvm, void, env, i32, i32, i32)
+DEF_HELPER_4(vector_vsbc_vxm, void, env, i32, i32, i32)
+DEF_HELPER_4(vector_vmsbc_vvm, void, env, i32, i32, i32)
+DEF_HELPER_4(vector_vmsbc_vxm, void, env, i32, i32, i32)
+DEF_HELPER_5(vector_vadd_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vadd_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vadd_vi, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vsub_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vsub_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vrsub_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vrsub_vi, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vwaddu_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vwaddu_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vwadd_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vwadd_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vwsubu_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vwsubu_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vwsub_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vwsub_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vwaddu_wv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vwaddu_wx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vwadd_wv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vwadd_wx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vwsubu_wv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vwsubu_wx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vwsub_wv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vwsub_wx, void, env, i32, i32, i32, i32)
+
DEF_HELPER_4(vector_vsetvli, void, env, i32, i32, i32)
DEF_HELPER_4(vector_vsetvl, void, env, i32, i32, i32)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 48e7661..fc7e498 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -63,6 +63,7 @@
@r_rm ....... ..... ..... ... ..... ....... %rs2 %rs1 %rm %rd
@r2_rm ....... ..... ..... ... ..... ....... %rs1 %rm %rd
@r2 ....... ..... ..... ... ..... ....... %rs1 %rd
+@r_vm ...... vm:1 ..... ..... ... ..... ....... %rs2 %rs1 %rd
@r_wdvm ..... wd:1 vm:1 ..... ..... ... ..... ....... %rs2 %rs1 %rd
@r_nfvm nf:3 ... vm:1 ..... ..... ... ..... ....... %rs2 %rs1 %rd
@r2_nfvm nf:3 ... vm:1 ..... ..... ... ..... ....... %rs1 %rd
@@ -280,5 +281,39 @@ vamomaxuw_v 11100 . . ..... ..... 110 ..... 0101111 @r_wdvm
vamomaxud_v 11100 . . ..... ..... 111 ..... 0101111 @r_wdvm
#*** new major opcode OP-V ***
+vadd_vv 000000 . ..... ..... 000 ..... 1010111 @r_vm
+vadd_vx 000000 . ..... ..... 100 ..... 1010111 @r_vm
+vadd_vi 000000 . ..... ..... 011 ..... 1010111 @r_vm
+vsub_vv 000010 . ..... ..... 000 ..... 1010111 @r_vm
+vsub_vx 000010 . ..... ..... 100 ..... 1010111 @r_vm
+vrsub_vx 000011 . ..... ..... 100 ..... 1010111 @r_vm
+vrsub_vi 000011 . ..... ..... 011 ..... 1010111 @r_vm
+vwaddu_vv 110000 . ..... ..... 010 ..... 1010111 @r_vm
+vwaddu_vx 110000 . ..... ..... 110 ..... 1010111 @r_vm
+vwadd_vv 110001 . ..... ..... 010 ..... 1010111 @r_vm
+vwadd_vx 110001 . ..... ..... 110 ..... 1010111 @r_vm
+vwsubu_vv 110010 . ..... ..... 010 ..... 1010111 @r_vm
+vwsubu_vx 110010 . ..... ..... 110 ..... 1010111 @r_vm
+vwsub_vv 110011 . ..... ..... 010 ..... 1010111 @r_vm
+vwsub_vx 110011 . ..... ..... 110 ..... 1010111 @r_vm
+vwaddu_wv 110100 . ..... ..... 010 ..... 1010111 @r_vm
+vwaddu_wx 110100 . ..... ..... 110 ..... 1010111 @r_vm
+vwadd_wv 110101 . ..... ..... 010 ..... 1010111 @r_vm
+vwadd_wx 110101 . ..... ..... 110 ..... 1010111 @r_vm
+vwsubu_wv 110110 . ..... ..... 010 ..... 1010111 @r_vm
+vwsubu_wx 110110 . ..... ..... 110 ..... 1010111 @r_vm
+vwsub_wv 110111 . ..... ..... 010 ..... 1010111 @r_vm
+vwsub_wx 110111 . ..... ..... 110 ..... 1010111 @r_vm
+vadc_vvm 010000 1 ..... ..... 000 ..... 1010111 @r
+vadc_vxm 010000 1 ..... ..... 100 ..... 1010111 @r
+vadc_vim 010000 1 ..... ..... 011 ..... 1010111 @r
+vmadc_vvm 010001 1 ..... ..... 000 ..... 1010111 @r
+vmadc_vxm 010001 1 ..... ..... 100 ..... 1010111 @r
+vmadc_vim 010001 1 ..... ..... 011 ..... 1010111 @r
+vsbc_vvm 010010 1 ..... ..... 000 ..... 1010111 @r
+vsbc_vxm 010010 1 ..... ..... 100 ..... 1010111 @r
+vmsbc_vvm 010011 1 ..... ..... 000 ..... 1010111 @r
+vmsbc_vxm 010011 1 ..... ..... 100 ..... 1010111 @r
+
vsetvli 0 ........... ..... 111 ..... 1010111 @r2_zimm
vsetvl 1000000 ..... ..... 111 ..... 1010111 @r
diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index 7bda378..a1c1960 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -77,6 +77,21 @@ static bool trans_##INSN(DisasContext *ctx, arg_##INSN * a) \
return true; \
}
+#define GEN_VECTOR_R_VM(INSN) \
+static bool trans_##INSN(DisasContext *ctx, arg_##INSN * a) \
+{ \
+ TCGv_i32 s1 = tcg_const_i32(a->rs1); \
+ TCGv_i32 s2 = tcg_const_i32(a->rs2); \
+ TCGv_i32 d = tcg_const_i32(a->rd); \
+ TCGv_i32 vm = tcg_const_i32(a->vm); \
+ gen_helper_vector_##INSN(cpu_env, vm, s1, s2, d); \
+ tcg_temp_free_i32(s1); \
+ tcg_temp_free_i32(s2); \
+ tcg_temp_free_i32(d); \
+ tcg_temp_free_i32(vm); \
+ return true; \
+}
+
#define GEN_VECTOR_R2_ZIMM(INSN) \
static bool trans_##INSN(DisasContext *ctx, arg_##INSN * a) \
{ \
@@ -155,5 +170,39 @@ GEN_VECTOR_R_WDVM(vamominud_v)
GEN_VECTOR_R_WDVM(vamomaxuw_v)
GEN_VECTOR_R_WDVM(vamomaxud_v)
+GEN_VECTOR_R(vadc_vvm)
+GEN_VECTOR_R(vadc_vxm)
+GEN_VECTOR_R(vadc_vim)
+GEN_VECTOR_R(vmadc_vvm)
+GEN_VECTOR_R(vmadc_vxm)
+GEN_VECTOR_R(vmadc_vim)
+GEN_VECTOR_R(vsbc_vvm)
+GEN_VECTOR_R(vsbc_vxm)
+GEN_VECTOR_R(vmsbc_vvm)
+GEN_VECTOR_R(vmsbc_vxm)
+GEN_VECTOR_R_VM(vadd_vv)
+GEN_VECTOR_R_VM(vadd_vx)
+GEN_VECTOR_R_VM(vadd_vi)
+GEN_VECTOR_R_VM(vsub_vv)
+GEN_VECTOR_R_VM(vsub_vx)
+GEN_VECTOR_R_VM(vrsub_vx)
+GEN_VECTOR_R_VM(vrsub_vi)
+GEN_VECTOR_R_VM(vwaddu_vv)
+GEN_VECTOR_R_VM(vwaddu_vx)
+GEN_VECTOR_R_VM(vwadd_vv)
+GEN_VECTOR_R_VM(vwadd_vx)
+GEN_VECTOR_R_VM(vwsubu_vv)
+GEN_VECTOR_R_VM(vwsubu_vx)
+GEN_VECTOR_R_VM(vwsub_vv)
+GEN_VECTOR_R_VM(vwsub_vx)
+GEN_VECTOR_R_VM(vwaddu_wv)
+GEN_VECTOR_R_VM(vwaddu_wx)
+GEN_VECTOR_R_VM(vwadd_wv)
+GEN_VECTOR_R_VM(vwadd_wx)
+GEN_VECTOR_R_VM(vwsubu_wv)
+GEN_VECTOR_R_VM(vwsubu_wx)
+GEN_VECTOR_R_VM(vwsub_wv)
+GEN_VECTOR_R_VM(vwsub_wx)
+
GEN_VECTOR_R2_ZIMM(vsetvli)
GEN_VECTOR_R(vsetvl)
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 9ebf70d..95336c9 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -24,12 +24,21 @@
#include <math.h>
#define VECTOR_HELPER(name) HELPER(glue(vector_, name))
+#define SIGNBIT8 (1 << 7)
+#define SIGNBIT16 (1 << 15)
+#define SIGNBIT32 (1 << 31)
+#define SIGNBIT64 ((uint64_t)1 << 63)
static int64_t sign_extend(int64_t a, int8_t width)
{
return a << (64 - width) >> (64 - width);
}
+static int64_t extend_gpr(target_ulong reg)
+{
+ return sign_extend(reg, sizeof(target_ulong) * 8);
+}
+
static target_ulong vector_get_index(CPURISCVState *env, int rs1, int rs2,
int index, int mem, int width, int nf)
{
@@ -118,6 +127,39 @@ static inline bool vector_overlap_vm_common(int lmul, int vm, int rd)
return false;
}
+static inline bool vector_overlap_vm_force(int vm, int rd)
+{
+ if (vm == 0 && rd == 0) {
+ return true;
+ }
+ return false;
+}
+
+static inline bool vector_overlap_carry(int lmul, int rd)
+{
+ if (lmul > 1 && rd == 0) {
+ return true;
+ }
+ return false;
+}
+
+static inline bool vector_overlap_dstgp_srcgp(int rd, int dlen, int rs,
+ int slen)
+{
+ if ((rd >= rs && rd < rs + slen) || (rs >= rd && rs < rd + dlen)) {
+ return true;
+ }
+ return false;
+}
+
+static inline void vector_get_layout(CPURISCVState *env, int width, int lmul,
+ int index, int *idx, int *pos)
+{
+ int mlen = width / lmul;
+ *idx = (index * mlen) / 8;
+ *pos = (index * mlen) % 8;
+}
+
static bool vector_lmul_check_reg(CPURISCVState *env, uint32_t lmul,
uint32_t reg, bool widen)
{
@@ -185,6 +227,173 @@ static void vector_tail_segment(CPURISCVState *env, int vreg, int index,
}
}
+static void vector_tail_common(CPURISCVState *env, int vreg, int index,
+ int width)
+{
+ switch (width) {
+ case 8:
+ env->vfp.vreg[vreg].u8[index] = 0;
+ break;
+ case 16:
+ env->vfp.vreg[vreg].u16[index] = 0;
+ break;
+ case 32:
+ env->vfp.vreg[vreg].u32[index] = 0;
+ break;
+ case 64:
+ env->vfp.vreg[vreg].u64[index] = 0;
+ break;
+ default:
+ helper_raise_exception(env, RISCV_EXCP_ILLEGAL_INST);
+ return;
+ }
+}
+
+static void vector_tail_widen(CPURISCVState *env, int vreg, int index,
+ int width)
+{
+ switch (width) {
+ case 8:
+ env->vfp.vreg[vreg].u16[index] = 0;
+ break;
+ case 16:
+ env->vfp.vreg[vreg].u32[index] = 0;
+ break;
+ case 32:
+ env->vfp.vreg[vreg].u64[index] = 0;
+ break;
+ default:
+ helper_raise_exception(env, RISCV_EXCP_ILLEGAL_INST);
+ return;
+ }
+}
+
+static inline int vector_get_carry(CPURISCVState *env, int width, int lmul,
+ int index)
+{
+ int mlen = width / lmul;
+ int idx = (index * mlen) / 8;
+ int pos = (index * mlen) % 8;
+
+ return (env->vfp.vreg[0].u8[idx] >> pos) & 0x1;
+}
+
+static inline void vector_mask_result(CPURISCVState *env, uint32_t reg,
+ int width, int lmul, int index, uint32_t result)
+{
+ int mlen = width / lmul;
+ int idx = (index * mlen) / width;
+ int pos = (index * mlen) % width;
+ uint64_t mask = ~((((uint64_t)1 << mlen) - 1) << pos);
+
+ switch (width) {
+ case 8:
+ env->vfp.vreg[reg].u8[idx] = (env->vfp.vreg[reg].u8[idx] & mask)
+ | (result << pos);
+ break;
+ case 16:
+ env->vfp.vreg[reg].u16[idx] = (env->vfp.vreg[reg].u16[idx] & mask)
+ | (result << pos);
+ break;
+ case 32:
+ env->vfp.vreg[reg].u32[idx] = (env->vfp.vreg[reg].u32[idx] & mask)
+ | (result << pos);
+ break;
+ case 64:
+ env->vfp.vreg[reg].u64[idx] = (env->vfp.vreg[reg].u64[idx] & mask)
+ | ((uint64_t)result << pos);
+ break;
+ default:
+ helper_raise_exception(env, RISCV_EXCP_ILLEGAL_INST);
+ break;
+ }
+
+ return;
+}
+
+static inline uint64_t u64xu64_lh(uint64_t a, uint64_t b)
+{
+ uint64_t hi_64, carry;
+
+ /* first get the whole product in {hi_64, lo_64} */
+ uint64_t a_hi = a >> 32;
+ uint64_t a_lo = (uint32_t)a;
+ uint64_t b_hi = b >> 32;
+ uint64_t b_lo = (uint32_t)b;
+
+ /*
+ * a * b = ((a_hi << 32) + a_lo) * ((b_hi << 32) + b_lo)
+ * = ((a_hi * b_hi) << 64) + ((a_hi * b_lo) << 32) +
+ * ((a_lo * b_hi) << 32) + (a_lo * b_lo)
+ * = {hi_64, lo_64}
+ * hi_64 = (a_hi * b_hi) + ((a_hi * b_lo) >> 32) +
+ * ((a_lo * b_hi) >> 32) + carry
+ * carry = ((uint64_t)(uint32_t)(a_hi * b_lo) +
+ * (uint64_t)(uint32_t)(a_lo * b_hi) + ((a_lo * b_lo) >> 32)) >> 32
+ */
+
+ carry = ((uint64_t)(uint32_t)(a_hi * b_lo) +
+ (uint64_t)(uint32_t)(a_lo * b_hi) +
+ ((a_lo * b_lo) >> 32)) >> 32;
+
+ hi_64 = a_hi * b_hi +
+ ((a_hi * b_lo) >> 32) + ((a_lo * b_hi) >> 32) +
+ carry;
+
+ return hi_64;
+}
+
+static inline int64_t s64xu64_lh(int64_t a, uint64_t b)
+{
+ uint64_t abs_a = a;
+ uint64_t lo_64, hi_64;
+
+ if (a < 0) {
+ abs_a = ~a + 1;
+ }
+ lo_64 = abs_a * b;
+ hi_64 = u64xu64_lh(abs_a, b);
+
+ if (a < 0) { /* b is unsigned, so only a decides the sign */
+ lo_64 = ~lo_64;
+ hi_64 = ~hi_64;
+ if (lo_64 == UINT64_MAX) {
+ lo_64 = 0;
+ hi_64 += 1;
+ } else {
+ lo_64 += 1;
+ }
+ }
+ return hi_64;
+}
+
+static inline int64_t s64xs64_lh(int64_t a, int64_t b)
+{
+ uint64_t abs_a = a, abs_b = b;
+ uint64_t lo_64, hi_64;
+
+ if (a < 0) {
+ abs_a = ~a + 1;
+ }
+ if (b < 0) {
+ abs_b = ~b + 1;
+ }
+ lo_64 = abs_a * abs_b;
+ hi_64 = u64xu64_lh(abs_a, abs_b);
+
+ if ((a ^ b) & SIGNBIT64) {
+ lo_64 = ~lo_64;
+ hi_64 = ~hi_64;
+ if (lo_64 == UINT64_MAX) {
+ lo_64 = 0;
+ hi_64 += 1;
+ } else {
+ lo_64 += 1;
+ }
+ }
+ return hi_64;
+}
+
void VECTOR_HELPER(vsetvl)(CPURISCVState *env, uint32_t rs1, uint32_t rs2,
uint32_t rd)
{
@@ -4796,3 +5005,2129 @@ void VECTOR_HELPER(vamomaxud_v)(CPURISCVState *env, uint32_t wd, uint32_t vm,
env->vfp.vstart = 0;
}
+void VECTOR_HELPER(vadc_vvm)(CPURISCVState *env, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src1, src2, dest, vlmax, carry;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_carry(lmul, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ carry = vector_get_carry(env, width, lmul, i);
+ env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src1].u8[j]
+ + env->vfp.vreg[src2].u8[j] + carry;
+ break;
+ case 16:
+ carry = vector_get_carry(env, width, lmul, i);
+ env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src1].u16[j]
+ + env->vfp.vreg[src2].u16[j] + carry;
+ break;
+ case 32:
+ carry = vector_get_carry(env, width, lmul, i);
+ env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src1].u32[j]
+ + env->vfp.vreg[src2].u32[j] + carry;
+ break;
+ case 64:
+ carry = vector_get_carry(env, width, lmul, i);
+ env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src1].u64[j]
+ + env->vfp.vreg[src2].u64[j] + carry;
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vadc_vxm)(CPURISCVState *env, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, dest, vlmax, carry;
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_carry(lmul, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ carry = vector_get_carry(env, width, lmul, i);
+ env->vfp.vreg[dest].u8[j] = env->gpr[rs1]
+ + env->vfp.vreg[src2].u8[j] + carry;
+ break;
+ case 16:
+ carry = vector_get_carry(env, width, lmul, i);
+ env->vfp.vreg[dest].u16[j] = env->gpr[rs1]
+ + env->vfp.vreg[src2].u16[j] + carry;
+ break;
+ case 32:
+ carry = vector_get_carry(env, width, lmul, i);
+ env->vfp.vreg[dest].u32[j] = env->gpr[rs1]
+ + env->vfp.vreg[src2].u32[j] + carry;
+ break;
+ case 64:
+ carry = vector_get_carry(env, width, lmul, i);
+ env->vfp.vreg[dest].u64[j] = (uint64_t)extend_gpr(env->gpr[rs1])
+ + env->vfp.vreg[src2].u64[j] + carry;
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vadc_vim)(CPURISCVState *env, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, dest, vlmax, carry;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_carry(lmul, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ carry = vector_get_carry(env, width, lmul, i);
+ env->vfp.vreg[dest].u8[j] = sign_extend(rs1, 5)
+ + env->vfp.vreg[src2].u8[j] + carry;
+ break;
+ case 16:
+ carry = vector_get_carry(env, width, lmul, i);
+ env->vfp.vreg[dest].u16[j] = sign_extend(rs1, 5)
+ + env->vfp.vreg[src2].u16[j] + carry;
+ break;
+ case 32:
+ carry = vector_get_carry(env, width, lmul, i);
+ env->vfp.vreg[dest].u32[j] = sign_extend(rs1, 5)
+ + env->vfp.vreg[src2].u32[j] + carry;
+ break;
+ case 64:
+ carry = vector_get_carry(env, width, lmul, i);
+ env->vfp.vreg[dest].u64[j] = sign_extend(rs1, 5)
+ + env->vfp.vreg[src2].u64[j] + carry;
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vmadc_vvm)(CPURISCVState *env, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src1, src2, vlmax, carry;
+ uint64_t tmp;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_dstgp_srcgp(rd, 1, rs1, lmul)
+ || vector_overlap_dstgp_srcgp(rd, 1, rs2, lmul)
+ || (rd == 0)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ carry = vector_get_carry(env, width, lmul, i);
+ tmp = env->vfp.vreg[src1].u8[j]
+ + env->vfp.vreg[src2].u8[j] + carry;
+ tmp = tmp >> width;
+
+ vector_mask_result(env, rd, width, lmul, i, tmp);
+ break;
+ case 16:
+ carry = vector_get_carry(env, width, lmul, i);
+ tmp = env->vfp.vreg[src1].u16[j]
+ + env->vfp.vreg[src2].u16[j] + carry;
+ tmp = tmp >> width;
+ vector_mask_result(env, rd, width, lmul, i, tmp);
+ break;
+ case 32:
+ carry = vector_get_carry(env, width, lmul, i);
+ tmp = (uint64_t)env->vfp.vreg[src1].u32[j]
+ + (uint64_t)env->vfp.vreg[src2].u32[j] + carry;
+ tmp = tmp >> width;
+ vector_mask_result(env, rd, width, lmul, i, tmp);
+ break;
+ case 64:
+ carry = vector_get_carry(env, width, lmul, i);
+ tmp = env->vfp.vreg[src1].u64[j]
+ + env->vfp.vreg[src2].u64[j] + carry;
+
+ if ((tmp < env->vfp.vreg[src1].u64[j] ||
+ tmp < env->vfp.vreg[src2].u64[j])
+ || (env->vfp.vreg[src1].u64[j] == UINT64_MAX &&
+ env->vfp.vreg[src2].u64[j] == UINT64_MAX)) {
+ tmp = 1;
+ } else {
+ tmp = 0;
+ }
+ vector_mask_result(env, rd, width, lmul, i, tmp);
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ if (width <= 64) {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ } else {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vmadc_vxm)(CPURISCVState *env, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, vlmax, carry;
+ uint64_t tmp, extend_rs1;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_dstgp_srcgp(rd, 1, rs2, lmul)
+ || (rd == 0)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ carry = vector_get_carry(env, width, lmul, i);
+ tmp = (uint8_t)env->gpr[rs1]
+ + env->vfp.vreg[src2].u8[j] + carry;
+ tmp = tmp >> width;
+
+ vector_mask_result(env, rd, width, lmul, i, tmp);
+ break;
+ case 16:
+ carry = vector_get_carry(env, width, lmul, i);
+ tmp = (uint16_t)env->gpr[rs1]
+ + env->vfp.vreg[src2].u16[j] + carry;
+ tmp = tmp >> width;
+ vector_mask_result(env, rd, width, lmul, i, tmp);
+ break;
+ case 32:
+ carry = vector_get_carry(env, width, lmul, i);
+ tmp = (uint64_t)((uint32_t)env->gpr[rs1])
+ + (uint64_t)env->vfp.vreg[src2].u32[j] + carry;
+ tmp = tmp >> width;
+ vector_mask_result(env, rd, width, lmul, i, tmp);
+ break;
+ case 64:
+ carry = vector_get_carry(env, width, lmul, i);
+
+ extend_rs1 = (uint64_t)extend_gpr(env->gpr[rs1]);
+ tmp = extend_rs1 + env->vfp.vreg[src2].u64[j] + carry;
+ if ((tmp < extend_rs1) ||
+ (carry && (env->vfp.vreg[src2].u64[j] == UINT64_MAX))) {
+ tmp = 1;
+ } else {
+ tmp = 0;
+ }
+ vector_mask_result(env, rd, width, lmul, i, tmp);
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ if (width <= 64) {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ } else {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vmadc_vim)(CPURISCVState *env, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, vlmax, carry;
+ uint64_t tmp;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_dstgp_srcgp(rd, 1, rs2, lmul)
+ || (rd == 0)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ carry = vector_get_carry(env, width, lmul, i);
+ tmp = (uint8_t)sign_extend(rs1, 5)
+ + env->vfp.vreg[src2].u8[j] + carry;
+ tmp = tmp >> width;
+
+ vector_mask_result(env, rd, width, lmul, i, tmp);
+ break;
+ case 16:
+ carry = vector_get_carry(env, width, lmul, i);
+ tmp = (uint16_t)sign_extend(rs1, 5)
+ + env->vfp.vreg[src2].u16[j] + carry;
+ tmp = tmp >> width;
+ vector_mask_result(env, rd, width, lmul, i, tmp);
+ break;
+ case 32:
+ carry = vector_get_carry(env, width, lmul, i);
+ tmp = (uint64_t)((uint32_t)sign_extend(rs1, 5))
+ + (uint64_t)env->vfp.vreg[src2].u32[j] + carry;
+ tmp = tmp >> width;
+ vector_mask_result(env, rd, width, lmul, i, tmp);
+ break;
+ case 64:
+ carry = vector_get_carry(env, width, lmul, i);
+ tmp = (uint64_t)sign_extend(rs1, 5)
+ + env->vfp.vreg[src2].u64[j] + carry;
+
+ if ((tmp < (uint64_t)sign_extend(rs1, 5) ||
+ tmp < env->vfp.vreg[src2].u64[j])
+ || ((uint64_t)sign_extend(rs1, 5) == UINT64_MAX &&
+ env->vfp.vreg[src2].u64[j] == UINT64_MAX)) {
+ tmp = 1;
+ } else {
+ tmp = 0;
+ }
+ vector_mask_result(env, rd, width, lmul, i, tmp);
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ if (width <= 64) {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ } else {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vsbc_vvm)(CPURISCVState *env, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src1, src2, dest, vlmax, carry;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_carry(lmul, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ carry = vector_get_carry(env, width, lmul, i);
+ env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src2].u8[j]
+ - env->vfp.vreg[src1].u8[j] - carry;
+ break;
+ case 16:
+ carry = vector_get_carry(env, width, lmul, i);
+ env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src2].u16[j]
+ - env->vfp.vreg[src1].u16[j] - carry;
+ break;
+ case 32:
+ carry = vector_get_carry(env, width, lmul, i);
+ env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src2].u32[j]
+ - env->vfp.vreg[src1].u32[j] - carry;
+ break;
+ case 64:
+ carry = vector_get_carry(env, width, lmul, i);
+ env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src2].u64[j]
+ - env->vfp.vreg[src1].u64[j] - carry;
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vsbc_vxm)(CPURISCVState *env, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, dest, vlmax, carry;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_carry(lmul, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ carry = vector_get_carry(env, width, lmul, i);
+ env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src2].u8[j]
+ - env->gpr[rs1] - carry;
+ break;
+ case 16:
+ carry = vector_get_carry(env, width, lmul, i);
+ env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src2].u16[j]
+ - env->gpr[rs1] - carry;
+ break;
+ case 32:
+ carry = vector_get_carry(env, width, lmul, i);
+ env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src2].u32[j]
+ - env->gpr[rs1] - carry;
+ break;
+ case 64:
+ carry = vector_get_carry(env, width, lmul, i);
+ env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src2].u64[j]
+ - (uint64_t)extend_gpr(env->gpr[rs1]) - carry;
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vmsbc_vvm)(CPURISCVState *env, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src1, src2, vlmax, carry;
+ uint64_t tmp;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_dstgp_srcgp(rd, 1, rs1, lmul)
+ || vector_overlap_dstgp_srcgp(rd, 1, rs2, lmul)
+ || (rd == 0)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ carry = vector_get_carry(env, width, lmul, i);
+ tmp = env->vfp.vreg[src2].u8[j]
+ - env->vfp.vreg[src1].u8[j] - carry;
+ tmp = (tmp >> width) & 0x1;
+
+ vector_mask_result(env, rd, width, lmul, i, tmp);
+ break;
+ case 16:
+ carry = vector_get_carry(env, width, lmul, i);
+ tmp = env->vfp.vreg[src2].u16[j]
+ - env->vfp.vreg[src1].u16[j] - carry;
+ tmp = (tmp >> width) & 0x1;
+ vector_mask_result(env, rd, width, lmul, i, tmp);
+ break;
+ case 32:
+ carry = vector_get_carry(env, width, lmul, i);
+ tmp = (uint64_t)env->vfp.vreg[src2].u32[j]
+ - (uint64_t)env->vfp.vreg[src1].u32[j] - carry;
+ tmp = (tmp >> width) & 0x1;
+ vector_mask_result(env, rd, width, lmul, i, tmp);
+ break;
+ case 64:
+ carry = vector_get_carry(env, width, lmul, i);
+ tmp = env->vfp.vreg[src2].u64[j]
+ - env->vfp.vreg[src1].u64[j] - carry;
+
+ if (((env->vfp.vreg[src1].u64[j] == UINT64_MAX) && carry) ||
+ env->vfp.vreg[src2].u64[j] <
+ (env->vfp.vreg[src1].u64[j] + carry)) {
+ tmp = 1;
+ } else {
+ tmp = 0;
+ }
+ vector_mask_result(env, rd, width, lmul, i, tmp);
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ if (width <= 64) {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ } else {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vmsbc_vxm)(CPURISCVState *env, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, vlmax, carry;
+ uint64_t tmp, extend_rs1;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_dstgp_srcgp(rd, 1, rs2, lmul)
+ || (rd == 0)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ carry = vector_get_carry(env, width, lmul, i);
+ tmp = env->vfp.vreg[src2].u8[j]
+ - (uint8_t)env->gpr[rs1] - carry;
+ tmp = (tmp >> width) & 0x1;
+ vector_mask_result(env, rd, width, lmul, i, tmp);
+ break;
+ case 16:
+ carry = vector_get_carry(env, width, lmul, i);
+ tmp = env->vfp.vreg[src2].u16[j]
+ - (uint16_t)env->gpr[rs1] - carry;
+ tmp = (tmp >> width) & 0x1;
+ vector_mask_result(env, rd, width, lmul, i, tmp);
+ break;
+ case 32:
+ carry = vector_get_carry(env, width, lmul, i);
+ tmp = (uint64_t)env->vfp.vreg[src2].u32[j]
+ - (uint64_t)((uint32_t)env->gpr[rs1]) - carry;
+ tmp = (tmp >> width) & 0x1;
+ vector_mask_result(env, rd, width, lmul, i, tmp);
+ break;
+ case 64:
+ carry = vector_get_carry(env, width, lmul, i);
+
+ extend_rs1 = (uint64_t)extend_gpr(env->gpr[rs1]);
+ tmp = env->vfp.vreg[src2].u64[j] - extend_rs1 - carry;
+
+ if ((tmp > env->vfp.vreg[src2].u64[j]) ||
+ ((extend_rs1 == UINT64_MAX) && carry)) {
+ tmp = 1;
+ } else {
+ tmp = 0;
+ }
+ vector_mask_result(env, rd, width, lmul, i, tmp);
+ break;
+
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ if (width <= 64) {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ } else {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vadd_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src1, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src1].u8[j]
+ + env->vfp.vreg[src2].u8[j];
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src1].u16[j]
+ + env->vfp.vreg[src2].u16[j];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src1].u32[j]
+ + env->vfp.vreg[src2].u32[j];
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src1].u64[j]
+ + env->vfp.vreg[src2].u64[j];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vadd_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u8[j] = env->gpr[rs1]
+ + env->vfp.vreg[src2].u8[j];
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[j] = env->gpr[rs1]
+ + env->vfp.vreg[src2].u16[j];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[j] = env->gpr[rs1]
+ + env->vfp.vreg[src2].u32[j];
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[j] =
+ (uint64_t)extend_gpr(env->gpr[rs1])
+ + env->vfp.vreg[src2].u64[j];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vadd_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[j] = sign_extend(rs1, 5)
+ + env->vfp.vreg[src2].s8[j];
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[j] = sign_extend(rs1, 5)
+ + env->vfp.vreg[src2].s16[j];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[j] = sign_extend(rs1, 5)
+ + env->vfp.vreg[src2].s32[j];
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[j] = sign_extend(rs1, 5)
+ + env->vfp.vreg[src2].s64[j];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vsub_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src1, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src2].u8[j]
+ - env->vfp.vreg[src1].u8[j];
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src2].u16[j]
+ - env->vfp.vreg[src1].u16[j];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src2].u32[j]
+ - env->vfp.vreg[src1].u32[j];
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src2].u64[j]
+ - env->vfp.vreg[src1].u64[j];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vsub_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src2].u8[j]
+ - env->gpr[rs1];
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src2].u16[j]
+ - env->gpr[rs1];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src2].u32[j]
+ - env->gpr[rs1];
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src2].u64[j]
+ - (uint64_t)extend_gpr(env->gpr[rs1]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vrsub_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u8[j] = env->gpr[rs1]
+ - env->vfp.vreg[src2].u8[j];
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[j] = env->gpr[rs1]
+ - env->vfp.vreg[src2].u16[j];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[j] = env->gpr[rs1]
+ - env->vfp.vreg[src2].u32[j];
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[j] =
+ (uint64_t)extend_gpr(env->gpr[rs1])
+ - env->vfp.vreg[src2].u64[j];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vrsub_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[j] = sign_extend(rs1, 5)
+ - env->vfp.vreg[src2].s8[j];
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[j] = sign_extend(rs1, 5)
+ - env->vfp.vreg[src2].s16[j];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[j] = sign_extend(rs1, 5)
+ - env->vfp.vreg[src2].s32[j];
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[j] = sign_extend(rs1, 5)
+ - env->vfp.vreg[src2].s64[j];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vwaddu_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl;
+ uint32_t lmul, width, src1, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)
+ ) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / (2 * width)));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[k] =
+ (uint16_t)env->vfp.vreg[src1].u8[j] +
+ (uint16_t)env->vfp.vreg[src2].u8[j];
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[k] =
+ (uint32_t)env->vfp.vreg[src1].u16[j] +
+ (uint32_t)env->vfp.vreg[src2].u16[j];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[k] =
+ (uint64_t)env->vfp.vreg[src1].u32[j] +
+ (uint64_t)env->vfp.vreg[src2].u32[j];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_widen(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vwaddu_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)
+ ) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / (2 * width)));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[k] =
+ (uint16_t)env->vfp.vreg[src2].u8[j] +
+ (uint16_t)((uint8_t)env->gpr[rs1]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[k] =
+ (uint32_t)env->vfp.vreg[src2].u16[j] +
+ (uint32_t)((uint16_t)env->gpr[rs1]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[k] =
+ (uint64_t)env->vfp.vreg[src2].u32[j] +
+ (uint64_t)((uint32_t)env->gpr[rs1]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_widen(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vwadd_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl;
+ uint32_t lmul, width, src1, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / (2 * width)));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[k] =
+ (int16_t)env->vfp.vreg[src1].s8[j] +
+ (int16_t)env->vfp.vreg[src2].s8[j];
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[k] =
+ (int32_t)env->vfp.vreg[src1].s16[j] +
+ (int32_t)env->vfp.vreg[src2].s16[j];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[k] =
+ (int64_t)env->vfp.vreg[src1].s32[j] +
+ (int64_t)env->vfp.vreg[src2].s32[j];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_widen(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vwadd_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / (2 * width)));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[k] =
+ (int16_t)((int8_t)env->vfp.vreg[src2].s8[j]) +
+ (int16_t)((int8_t)env->gpr[rs1]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[k] =
+ (int32_t)((int16_t)env->vfp.vreg[src2].s16[j]) +
+ (int32_t)((int16_t)env->gpr[rs1]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[k] =
+ (int64_t)((int32_t)env->vfp.vreg[src2].s32[j]) +
+ (int64_t)((int32_t)env->gpr[rs1]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_widen(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vwsubu_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl;
+ uint32_t lmul, width, src1, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)
+ ) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / (2 * width)));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[k] =
+ (uint16_t)env->vfp.vreg[src2].u8[j] -
+ (uint16_t)env->vfp.vreg[src1].u8[j];
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[k] =
+ (uint32_t)env->vfp.vreg[src2].u16[j] -
+ (uint32_t)env->vfp.vreg[src1].u16[j];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[k] =
+ (uint64_t)env->vfp.vreg[src2].u32[j] -
+ (uint64_t)env->vfp.vreg[src1].u32[j];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_widen(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vwsubu_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)
+ ) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / (2 * width)));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[k] =
+ (uint16_t)env->vfp.vreg[src2].u8[j] -
+ (uint16_t)((uint8_t)env->gpr[rs1]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[k] =
+ (uint32_t)env->vfp.vreg[src2].u16[j] -
+ (uint32_t)((uint16_t)env->gpr[rs1]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[k] =
+ (uint64_t)env->vfp.vreg[src2].u32[j] -
+ (uint64_t)((uint32_t)env->gpr[rs1]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_widen(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vwsub_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl;
+ uint32_t lmul, width, src1, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)
+ ) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / (2 * width)));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[k] =
+ (int16_t)env->vfp.vreg[src2].s8[j] -
+ (int16_t)env->vfp.vreg[src1].s8[j];
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[k] =
+ (int32_t)env->vfp.vreg[src2].s16[j] -
+ (int32_t)env->vfp.vreg[src1].s16[j];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[k] =
+ (int64_t)env->vfp.vreg[src2].s32[j] -
+ (int64_t)env->vfp.vreg[src1].s32[j];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_widen(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vwsub_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)
+ ) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / (2 * width)));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[k] =
+ (int16_t)((int8_t)env->vfp.vreg[src2].s8[j]) -
+ (int16_t)((int8_t)env->gpr[rs1]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[k] =
+ (int32_t)((int16_t)env->vfp.vreg[src2].s16[j]) -
+ (int32_t)((int16_t)env->gpr[rs1]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[k] =
+ (int64_t)((int32_t)env->vfp.vreg[src2].s32[j]) -
+ (int64_t)((int32_t)env->gpr[rs1]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_widen(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vwaddu_wv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl;
+ uint32_t lmul, width, src1, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, true);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / (2 * width)));
+ dest = rd + (i / (VLEN / (2 * width)));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[k] =
+ (uint16_t)env->vfp.vreg[src1].u8[j] +
+ (uint16_t)env->vfp.vreg[src2].u16[k];
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[k] =
+ (uint32_t)env->vfp.vreg[src1].u16[j] +
+ (uint32_t)env->vfp.vreg[src2].u32[k];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[k] =
+ (uint64_t)env->vfp.vreg[src1].u32[j] +
+ (uint64_t)env->vfp.vreg[src2].u64[k];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_widen(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vwaddu_wx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, k, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, true);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / (2 * width)));
+ dest = rd + (i / (VLEN / (2 * width)));
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[k] =
+ (uint16_t)env->vfp.vreg[src2].u16[k] +
+ (uint16_t)((uint8_t)env->gpr[rs1]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[k] =
+ (uint32_t)env->vfp.vreg[src2].u32[k] +
+ (uint32_t)((uint16_t)env->gpr[rs1]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[k] =
+ (uint64_t)env->vfp.vreg[src2].u64[k] +
+ (uint64_t)((uint32_t)env->gpr[rs1]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_widen(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vwadd_wv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl;
+ uint32_t lmul, width, src1, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, true);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / (2 * width)));
+ dest = rd + (i / (VLEN / (2 * width)));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[k] =
+ (int16_t)((int8_t)env->vfp.vreg[src1].s8[j]) +
+ (int16_t)env->vfp.vreg[src2].s16[k];
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[k] =
+ (int32_t)((int16_t)env->vfp.vreg[src1].s16[j]) +
+ (int32_t)env->vfp.vreg[src2].s32[k];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[k] =
+ (int64_t)((int32_t)env->vfp.vreg[src1].s32[j]) +
+ (int64_t)env->vfp.vreg[src2].s64[k];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_widen(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vwadd_wx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, k, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, true);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / (2 * width)));
+ dest = rd + (i / (VLEN / (2 * width)));
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[k] =
+ (int16_t)env->vfp.vreg[src2].s16[k] +
+ (int16_t)((int8_t)env->gpr[rs1]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[k] =
+ (int32_t)env->vfp.vreg[src2].s32[k] +
+ (int32_t)((int16_t)env->gpr[rs1]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[k] =
+ (int64_t)env->vfp.vreg[src2].s64[k] +
+ (int64_t)((int32_t)env->gpr[rs1]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_widen(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vwsubu_wv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl;
+ uint32_t lmul, width, src1, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, true);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / (2 * width)));
+ dest = rd + (i / (VLEN / (2 * width)));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[k] =
+ (uint16_t)env->vfp.vreg[src2].u16[k] -
+ (uint16_t)env->vfp.vreg[src1].u8[j];
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[k] =
+ (uint32_t)env->vfp.vreg[src2].u32[k] -
+ (uint32_t)env->vfp.vreg[src1].u16[j];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[k] =
+ (uint64_t)env->vfp.vreg[src2].u64[k] -
+ (uint64_t)env->vfp.vreg[src1].u32[j];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_widen(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vwsubu_wx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, k, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, true);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / (2 * width)));
+ dest = rd + (i / (VLEN / (2 * width)));
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[k] =
+ (uint16_t)env->vfp.vreg[src2].u16[k] -
+ (uint16_t)((uint8_t)env->gpr[rs1]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[k] =
+ (uint32_t)env->vfp.vreg[src2].u32[k] -
+ (uint32_t)((uint16_t)env->gpr[rs1]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[k] =
+ (uint64_t)env->vfp.vreg[src2].u64[k] -
+ (uint64_t)((uint32_t)env->gpr[rs1]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_widen(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vwsub_wv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl;
+ uint32_t lmul, width, src1, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, true);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / (2 * width)));
+ dest = rd + (i / (VLEN / (2 * width)));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[k] =
+ (int16_t)env->vfp.vreg[src2].s16[k] -
+ (int16_t)((int8_t)env->vfp.vreg[src1].s8[j]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[k] =
+ (int32_t)env->vfp.vreg[src2].s32[k] -
+ (int32_t)((int16_t)env->vfp.vreg[src1].s16[j]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[k] =
+ (int64_t)env->vfp.vreg[src2].s64[k] -
+ (int64_t)((int32_t)env->vfp.vreg[src1].s32[j]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_widen(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vwsub_wx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, k, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, true);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / (2 * width)));
+ dest = rd + (i / (VLEN / (2 * width)));
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[k] =
+ (int16_t)env->vfp.vreg[src2].s16[k] -
+ (int16_t)((int8_t)env->gpr[rs1]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[k] =
+ (int32_t)env->vfp.vreg[src2].s32[k] -
+ (int32_t)((int16_t)env->gpr[rs1]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[k] =
+ (int64_t)env->vfp.vreg[src2].s64[k] -
+ (int64_t)((int32_t)env->gpr[rs1]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_widen(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
--
2.7.4
* Re: [Qemu-devel] [PATCH v2 08/17] RISC-V: add vector extension integer instructions part1, add/sub/adc/sbc
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 08/17] RISC-V: add vector extension integer instructions part1, add/sub/adc/sbc liuzhiwei
@ 2019-09-12 15:27 ` Richard Henderson
2019-09-12 15:35 ` Richard Henderson
0 siblings, 1 reply; 43+ messages in thread
From: Richard Henderson @ 2019-09-12 15:27 UTC (permalink / raw)
To: liuzhiwei, Alistair.Francis, palmer, sagark, kbastian,
riku.voipio, laurent, wenmeng_zhang
Cc: qemu-riscv, qemu-devel, wxy194768
On 9/11/19 2:25 AM, liuzhiwei wrote:
> #define VECTOR_HELPER(name) HELPER(glue(vector_, name))
> +#define SIGNBIT8 (1 << 7)
> +#define SIGNBIT16 (1 << 15)
> +#define SIGNBIT32 (1 << 31)
> +#define SIGNBIT64 ((uint64_t)1 << 63)
Perhaps make up your mind if you want signed or unsigned values? Perhaps just
use or redefine INT<N>_MIN instead?
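To make the signedness point concrete: `(1 << 31)` shifts into the sign bit of a 32-bit `int`, which ISO C leaves undefined, while `SIGNBIT64` is explicitly unsigned. A minimal sketch of the `INT<N>_MIN` alternative follows; the `DEMO_` macro names and the helper are hypothetical and not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, consistently signed replacements for SIGNBIT<N>. */
#define DEMO_SIGNBIT8   INT8_MIN
#define DEMO_SIGNBIT16  INT16_MIN
#define DEMO_SIGNBIT32  INT32_MIN
#define DEMO_SIGNBIT64  INT64_MIN

/* True when a and b have different signs (illustrative helper only). */
static inline bool demo_signs_differ64(int64_t a, int64_t b)
{
    return ((a ^ b) & DEMO_SIGNBIT64) != 0;
}
```

Because every constant is now a signed value of the matching width, a test such as `(a ^ b) & SIGNBIT64` no longer mixes signed and unsigned operands.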
> +static int64_t extend_gpr(target_ulong reg)
> +{
> + return sign_extend(reg, sizeof(target_ulong) * 8);
> +}
Note wrt usage:
+ extend_rs1 = (uint64_t)extend_gpr(env->gpr[rs1]);
This is equivalent to "extend_rs1 = (target_long)env->gpr[rs1]".
I don't see how this helper function is helping, really.
Also, pass gprs by value, not by index.
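The equivalence can be checked in isolation. The sketch below assumes an RV32-style configuration and uses stand-in typedefs (`demo_target_ulong`, `demo_target_long`) and a stand-in `demo_sign_extend`; the patch's own `sign_extend` is not shown in this message, so its exact signature is an assumption here.

```c
#include <stdint.h>

/* Stand-ins for an RV32 build; the real types come from QEMU's headers. */
typedef uint32_t demo_target_ulong;
typedef int32_t  demo_target_long;

/* Generic sign extension of the low `bits` bits (1..64) of value. */
static inline int64_t demo_sign_extend(uint64_t value, unsigned bits)
{
    uint64_t sign = UINT64_C(1) << (bits - 1);
    uint64_t mask = (sign << 1) - 1;   /* wraps to all-ones when bits == 64 */

    value &= mask;
    return (int64_t)((value ^ sign) - sign);
}

/* Shape of the helper under review. */
static inline int64_t demo_extend_gpr(demo_target_ulong reg)
{
    return demo_sign_extend(reg, sizeof(demo_target_ulong) * 8);
}

/* The plain cast suggested instead: same result, no helper needed. */
static inline int64_t demo_extend_by_cast(demo_target_ulong reg)
{
    return (demo_target_long)reg;
}
```

Both functions widen the register value with its own sign bit, so the cast alone is sufficient on the two's-complement hosts QEMU supports.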
> +static inline int vector_get_carry(CPURISCVState *env, int width, int lmul,
> + int index)
> +{
> + int mlen = width / lmul;
> + int idx = (index * mlen) / 8;
> + int pos = (index * mlen) % 8;
> +
> + return (env->vfp.vreg[0].u8[idx] >> pos) & 0x1;
> +}
Any reason not to re-use vector_elem_mask?
> +static inline uint64_t u64xu64_lh(uint64_t a, uint64_t b)
> +{
> + uint64_t hi_64, carry;
> +
> + /* first get the whole product in {hi_64, lo_64} */
> + uint64_t a_hi = a >> 32;
> + uint64_t a_lo = (uint32_t)a;
> + uint64_t b_hi = b >> 32;
> + uint64_t b_lo = (uint32_t)b;
> +
> + /*
> + * a * b = (a_hi << 32 + a_lo) * (b_hi << 32 + b_lo)
> + * = (a_hi * b_hi) << 64 + (a_hi * b_lo) << 32 +
> + * (a_lo * b_hi) << 32 + a_lo * b_lo
> + * = {hi_64, lo_64}
> + * hi_64 = ((a_hi * b_lo) << 32 + (a_lo * b_hi) << 32 + (a_lo * b_lo)) >> 64
> + * = (a_hi * b_lo) >> 32 + (a_lo * b_hi) >> 32 + carry
> + * carry = ((uint64_t)(uint32_t)(a_hi * b_lo) +
> + * (uint64_t)(uint32_t)(a_lo * b_hi) + (a_lo * b_lo) >> 32) >> 32
> + */
> +
> + carry = ((uint64_t)(uint32_t)(a_hi * b_lo) +
> + (uint64_t)(uint32_t)(a_lo * b_hi) +
> + ((a_lo * b_lo) >> 32)) >> 32;
> +
> + hi_64 = a_hi * b_hi +
> + ((a_hi * b_lo) >> 32) + ((a_lo * b_hi) >> 32) +
> + carry;
> +
> + return hi_64;
> +}
Use mulu64().
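As a rough illustration of what the replacement looks like: QEMU's `mulu64()` (declared in `qemu/host-utils.h`) returns the low and high halves of the 128-bit product through two pointers. The sketch below uses a stand-in, `demo_mulu64`, with the same parameter order so that it builds outside the QEMU tree; it assumes the GCC/Clang `unsigned __int128` extension, and the `demo_` names are hypothetical.

```c
#include <stdint.h>

/*
 * Stand-in with the same parameter order as QEMU's mulu64(); it relies on
 * the GCC/Clang unsigned __int128 extension to stay self-contained.
 */
static inline void demo_mulu64(uint64_t *plow, uint64_t *phigh,
                               uint64_t a, uint64_t b)
{
    unsigned __int128 p = (unsigned __int128)a * b;

    *plow = (uint64_t)p;
    *phigh = (uint64_t)(p >> 64);
}

/* High 64 bits of the unsigned product, replacing the manual carry logic. */
static inline uint64_t demo_u64xu64_lh(uint64_t a, uint64_t b)
{
    uint64_t lo, hi;

    demo_mulu64(&lo, &hi, a, b);
    return hi;
}
```

In the real helper the body would simply call `mulu64(&lo, &hi, a, b)` and return `hi`.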
> +static inline int64_t s64xu64_lh(int64_t a, uint64_t b)
> +{
> + uint64_t abs_a = a;
> + uint64_t lo_64, hi_64;
> +
> + if (a < 0) {
> + abs_a = ~a + 1;
abs_a = -a
> +static inline int64_t s64xs64_lh(int64_t a, int64_t b)
> +{
> + uint64_t abs_a = a, abs_b = b;
> + uint64_t lo_64, hi_64;
> +
> + if (a < 0) {
> + abs_a = ~a + 1;
> + }
> + if (b < 0) {
> + abs_b = ~b + 1;
> + }
> + lo_64 = abs_a * abs_b;
> + hi_64 = u64xu64_lh(abs_a, abs_b);
> +
> + if ((a ^ b) & SIGNBIT64) {
> + lo_64 = ~lo_64;
> + hi_64 = ~hi_64;
> + if (lo_64 == UINT64_MAX) {
> + lo_64 = 0;
> + hi_64 += 1;
> + } else {
> + lo_64 += 1;
> + }
> + }
> + return hi_64;
> +}
Use muls64().
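A sketch of the signed case, under the same assumptions as any out-of-tree example: `demo_muls64` is a stand-in that mirrors the parameter order of QEMU's `muls64()` from `qemu/host-utils.h` and uses the GCC/Clang `__int128` extension; the `demo_` names are hypothetical.

```c
#include <stdint.h>

/* Stand-in with the same parameter order as QEMU's muls64(). */
static inline void demo_muls64(uint64_t *plow, uint64_t *phigh,
                               int64_t a, int64_t b)
{
    __int128 p = (__int128)a * b;
    unsigned __int128 u = (unsigned __int128)p;   /* shift on unsigned bits */

    *plow = (uint64_t)u;
    *phigh = (uint64_t)(u >> 64);
}

/* High 64 bits of the signed product; no abs/negate fix-up is needed. */
static inline int64_t demo_s64xs64_lh(int64_t a, int64_t b)
{
    uint64_t lo, hi;

    demo_muls64(&lo, &hi, a, b);
    return (int64_t)hi;
}
```

This removes the absolute-value and conditional-negation steps entirely, including the `INT64_MIN` corner case they have to handle.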
> +void VECTOR_HELPER(vadc_vvm)(CPURISCVState *env, uint32_t rs1,
> + uint32_t rs2, uint32_t rd)
> +{
> + int i, j, vl;
> + uint32_t lmul, width, src1, src2, dest, vlmax, carry;
> +
> + vl = env->vfp.vl;
> + lmul = vector_get_lmul(env);
> + width = vector_get_width(env);
> + vlmax = vector_get_vlmax(env);
> +
> + if (vector_vtype_ill(env) || vector_overlap_carry(lmul, rd)) {
> + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
> + return;
> + }
> + vector_lmul_check_reg(env, lmul, rs1, false);
> + vector_lmul_check_reg(env, lmul, rs2, false);
> + vector_lmul_check_reg(env, lmul, rd, false);
> +
> + for (i = 0; i < vlmax; i++) {
> + src1 = rs1 + (i / (VLEN / width));
> + src2 = rs2 + (i / (VLEN / width));
> + dest = rd + (i / (VLEN / width));
> + j = i % (VLEN / width);
> + if (i < env->vfp.vstart) {
> + continue;
Again, hoist this vstart check out of the loop.
> + } else if (i < vl) {
I would think this too could be moved into the loop condition.
> + switch (width) {
> + case 8:
> + carry = vector_get_carry(env, width, lmul, i);
> + env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src1].u8[j]
> + + env->vfp.vreg[src2].u8[j] + carry;
> + break;
> + case 16:
> + carry = vector_get_carry(env, width, lmul, i);
> + env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src1].u16[j]
> + + env->vfp.vreg[src2].u16[j] + carry;
> + break;
> + case 32:
> + carry = vector_get_carry(env, width, lmul, i);
> + env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src1].u32[j]
> + + env->vfp.vreg[src2].u32[j] + carry;
> + break;
> + case 64:
> + carry = vector_get_carry(env, width, lmul, i);
> + env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src1].u64[j]
> + + env->vfp.vreg[src2].u64[j] + carry;
> + break;
> + default:
> + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
> + break;
> + }
> + } else {
> + vector_tail_common(env, dest, j, width);
The tail clearing would then be done as a loop of its own, which would devolve to
memset on a little-endian host.
> + }
> + }
> + env->vfp.vstart = 0;
> +}
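The three loop comments on this function (start at vstart, bound by vl, clear the tail separately) can be sketched on a simplified model. Everything below is an assumption made for illustration: the register group is flattened into a plain array of 8-bit elements, the carry bits are already unpacked into `carry[]`, and `demo_vadc_u8` is a hypothetical name.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical flat model: one register group viewed as vlmax 8-bit
 * elements, with the carry-in bits already unpacked into carry[].
 */
static void demo_vadc_u8(uint8_t *dest, const uint8_t *src1,
                         const uint8_t *src2, const uint8_t *carry,
                         uint32_t vstart, uint32_t vl, uint32_t vlmax)
{
    /*
     * Elements below vstart are skipped; as in the quoted loop, everything
     * from max(vstart, vl) up to vlmax is treated as tail.
     */
    uint32_t tail = vstart > vl ? vstart : vl;
    uint32_t i;

    /* Body: vstart is the loop start and vl the loop bound. */
    for (i = vstart; i < vl; i++) {
        dest[i] = (uint8_t)(src1[i] + src2[i] + carry[i]);
    }

    /* Tail clearing as a step of its own; for byte elements it is a memset. */
    if (tail < vlmax) {
        memset(dest + tail, 0, vlmax - tail);
    }
}
```

With wider elements stored in the `vreg` union, the same single `memset` only works when element order matches byte order, which is the little-endian caveat mentioned above.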
> +void VECTOR_HELPER(vadc_vxm)(CPURISCVState *env, uint32_t rs1,
> + uint32_t rs2, uint32_t rd)
> +{
Watch the spacing between functions.
Pass gpr rs1 by value.
> +void VECTOR_HELPER(vadc_vim)(CPURISCVState *env, uint32_t rs1,
> + uint32_t rs2, uint32_t rd)
> +{
...
> + env->vfp.vreg[dest].u8[j] = sign_extend(rs1, 5)
Pass the immediate as a sign-extended immediate to begin with, not as an
unsigned 5-bit field.
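For illustration, the sign extension would then happen once at decode time rather than in every helper. QEMU's decodetree supports signed fields (insn32.decode already declares `%imm_i 20:s12` that way), so the translator can receive a signed value directly. The function below is only a hypothetical stand-alone equivalent of what such a signed 5-bit field yields.

```c
#include <stdint.h>

/* Sign-extend a 5-bit immediate field once, at decode time. */
static inline int32_t demo_simm5(uint32_t field)
{
    field &= 0x1f;                          /* keep the 5 encoded bits */
    return (int32_t)(field ^ 0x10) - 0x10;  /* flip and subtract the sign bit */
}
```

The helper then takes the immediate as an ordinary signed scalar and needs no `sign_extend(rs1, 5)` per element.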
All of the rest of the helpers are about the same.
Consider creating a helper function that contains the basic outline of the
vector processing, and takes a (set of) function pointers that perform the
operation. With optimization, compiler inlining should produce the same code
as you have here without having to replicate quite so much code for each
helper. You can also fix a bug in the basic outline in one place instead of
hundreds.
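A minimal sketch of that structure, under simplifying assumptions: 8-bit elements only, the register group flattened into plain arrays, and the mask given as one byte per element. The typedef and every `demo_` name are hypothetical; the real outline would also dispatch on element width and use the patch's own mask and tail helpers.

```c
#include <stdint.h>
#include <string.h>

/* Per-element operation; the typedef and all demo_ names are hypothetical. */
typedef uint8_t demo_binop_u8(uint8_t a, uint8_t b);

static inline uint8_t demo_op_add(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}

static inline uint8_t demo_op_and(uint8_t a, uint8_t b)
{
    return a & b;
}

/* The shared outline: body loop, masking and tail clearing live here once. */
static inline void demo_do_vv_u8(uint8_t *dest, const uint8_t *src1,
                                 const uint8_t *src2, const uint8_t *mask,
                                 uint32_t vstart, uint32_t vl,
                                 uint32_t vlmax, demo_binop_u8 *op)
{
    uint32_t tail = vstart > vl ? vstart : vl;
    uint32_t i;

    for (i = vstart; i < vl; i++) {
        if (mask[i]) {                 /* masked-off elements stay unchanged */
            dest[i] = op(src1[i], src2[i]);
        }
    }
    if (tail < vlmax) {
        memset(dest + tail, 0, vlmax - tail);
    }
}

/* Each helper shrinks to a one-line wrapper around the outline. */
static void demo_helper_vadd_vv_u8(uint8_t *dest, const uint8_t *src1,
                                   const uint8_t *src2, const uint8_t *mask,
                                   uint32_t vstart, uint32_t vl,
                                   uint32_t vlmax)
{
    demo_do_vv_u8(dest, src1, src2, mask, vstart, vl, vlmax, demo_op_add);
}

static void demo_helper_vand_vv_u8(uint8_t *dest, const uint8_t *src1,
                                   const uint8_t *src2, const uint8_t *mask,
                                   uint32_t vstart, uint32_t vl,
                                   uint32_t vlmax)
{
    demo_do_vv_u8(dest, src1, src2, mask, vstart, vl, vlmax, demo_op_and);
}
```

Since the outline is `static inline` and the operation is a compile-time constant at each call site, an optimizing compiler can inline the call through the pointer, which is the effect described above.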
r~
* Re: [Qemu-devel] [PATCH v2 08/17] RISC-V: add vector extension integer instructions part1, add/sub/adc/sbc
2019-09-12 15:27 ` Richard Henderson
@ 2019-09-12 15:35 ` Richard Henderson
0 siblings, 0 replies; 43+ messages in thread
From: Richard Henderson @ 2019-09-12 15:35 UTC (permalink / raw)
To: liuzhiwei, Alistair.Francis, palmer, sagark, kbastian,
riku.voipio, laurent, wenmeng_zhang
Cc: qemu-riscv, qemu-devel, wxy194768
On 9/12/19 11:27 AM, Richard Henderson wrote:
>> +void VECTOR_HELPER(vadc_vxm)(CPURISCVState *env, uint32_t rs1,
>> + uint32_t rs2, uint32_t rd)
>> +{
>
> Watch the spacing between functions.
> Pass gpr rs1 by value.
>
>> +void VECTOR_HELPER(vadc_vim)(CPURISCVState *env, uint32_t rs1,
>> + uint32_t rs2, uint32_t rd)
>> +{
> ...
>> + env->vfp.vreg[dest].u8[j] = sign_extend(rs1, 5)
>
> Pass the immediate as a sign-extended immediate to begin with, not as an
> unsigned 5-bit field.
Oh, and of course *_vxm and *_vim should be identical, because in both cases
there is a single scalar parameter. In the first case the scalar is passed by
value from the gpr; in the second case the scalar is the sign-extended constant.
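Sketched on a simplified model (8-bit elements, flat arrays, carry bits unpacked into `carry[]`; all `demo_` names are hypothetical), the two forms collapse into one scalar-operand routine and differ only in what the caller passes:

```c
#include <stdint.h>

/* One hypothetical scalar-operand routine shared by the .vxm and .vim forms. */
static void demo_vadc_vsm_u8(uint8_t *dest, const uint8_t *src2,
                             const uint8_t *carry, int64_t scalar,
                             uint32_t vl)
{
    for (uint32_t i = 0; i < vl; i++) {
        dest[i] = (uint8_t)(src2[i] + (uint8_t)scalar + carry[i]);
    }
}

/* .vxm: the caller reads the GPR and passes its value. */
static void demo_vadc_vxm_u8(uint8_t *dest, const uint8_t *src2,
                             const uint8_t *carry, uint64_t gpr_value,
                             uint32_t vl)
{
    demo_vadc_vsm_u8(dest, src2, carry, (int64_t)gpr_value, vl);
}

/* .vim: the caller passes the already sign-extended immediate. */
static void demo_vadc_vim_u8(uint8_t *dest, const uint8_t *src2,
                             const uint8_t *carry, int32_t simm5,
                             uint32_t vl)
{
    demo_vadc_vsm_u8(dest, src2, carry, simm5, vl);
}
```

In the real code the two wrappers would live in the translator, so only one out-of-line helper per operation remains.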
r~
* [Qemu-devel] [PATCH v2 09/17] RISC-V: add vector extension integer instructions part2, bit/shift
2019-09-11 6:25 [Qemu-devel] [PATCH v2 00/17] RISC-V: support vector extension liuzhiwei
` (7 preceding siblings ...)
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 08/17] RISC-V: add vector extension integer instructions part1, add/sub/adc/sbc liuzhiwei
@ 2019-09-11 6:25 ` liuzhiwei
2019-09-12 16:41 ` Richard Henderson
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 10/17] RISC-V: add vector extension integer instructions part3, cmp/min/max liuzhiwei
` (8 subsequent siblings)
17 siblings, 1 reply; 43+ messages in thread
From: liuzhiwei @ 2019-09-11 6:25 UTC (permalink / raw)
To: Alistair.Francis, palmer, sagark, kbastian, riku.voipio, laurent,
wenmeng_zhang
Cc: qemu-riscv, qemu-devel, wxy194768, LIU Zhiwei
From: LIU Zhiwei <zhiwei_liu@c-sky.com>
Signed-off-by: LIU Zhiwei <zhiwei_liu@c-sky.com>
---
target/riscv/helper.h | 25 +
target/riscv/insn32.decode | 25 +
target/riscv/insn_trans/trans_rvv.inc.c | 25 +
target/riscv/vector_helper.c | 1477 +++++++++++++++++++++++++++++++
4 files changed, 1552 insertions(+)
diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 31e20dc..28863e2 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -175,5 +175,30 @@ DEF_HELPER_5(vector_vwsubu_wx, void, env, i32, i32, i32, i32)
DEF_HELPER_5(vector_vwsub_wv, void, env, i32, i32, i32, i32)
DEF_HELPER_5(vector_vwsub_wx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vand_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vand_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vand_vi, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vor_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vor_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vor_vi, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vxor_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vxor_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vxor_vi, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vsll_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vsll_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vsll_vi, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vsrl_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vsrl_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vsrl_vi, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vsra_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vsra_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vsra_vi, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vnsrl_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vnsrl_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vnsrl_vi, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vnsra_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vnsra_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vnsra_vi, void, env, i32, i32, i32, i32)
+
DEF_HELPER_4(vector_vsetvli, void, env, i32, i32, i32)
DEF_HELPER_4(vector_vsetvl, void, env, i32, i32, i32)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index fc7e498..19710f5 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -315,5 +315,30 @@ vsbc_vxm 010010 1 ..... ..... 100 ..... 1010111 @r
vmsbc_vvm 010011 1 ..... ..... 000 ..... 1010111 @r
vmsbc_vxm 010011 1 ..... ..... 100 ..... 1010111 @r
+vand_vv 001001 . ..... ..... 000 ..... 1010111 @r_vm
+vand_vx 001001 . ..... ..... 100 ..... 1010111 @r_vm
+vand_vi 001001 . ..... ..... 011 ..... 1010111 @r_vm
+vor_vv 001010 . ..... ..... 000 ..... 1010111 @r_vm
+vor_vx 001010 . ..... ..... 100 ..... 1010111 @r_vm
+vor_vi 001010 . ..... ..... 011 ..... 1010111 @r_vm
+vxor_vv 001011 . ..... ..... 000 ..... 1010111 @r_vm
+vxor_vx 001011 . ..... ..... 100 ..... 1010111 @r_vm
+vxor_vi 001011 . ..... ..... 011 ..... 1010111 @r_vm
+vsll_vv 100101 . ..... ..... 000 ..... 1010111 @r_vm
+vsll_vx 100101 . ..... ..... 100 ..... 1010111 @r_vm
+vsll_vi 100101 . ..... ..... 011 ..... 1010111 @r_vm
+vsrl_vv 101000 . ..... ..... 000 ..... 1010111 @r_vm
+vsrl_vx 101000 . ..... ..... 100 ..... 1010111 @r_vm
+vsrl_vi 101000 . ..... ..... 011 ..... 1010111 @r_vm
+vsra_vv 101001 . ..... ..... 000 ..... 1010111 @r_vm
+vsra_vx 101001 . ..... ..... 100 ..... 1010111 @r_vm
+vsra_vi 101001 . ..... ..... 011 ..... 1010111 @r_vm
+vnsrl_vv 101100 . ..... ..... 000 ..... 1010111 @r_vm
+vnsrl_vx 101100 . ..... ..... 100 ..... 1010111 @r_vm
+vnsrl_vi 101100 . ..... ..... 011 ..... 1010111 @r_vm
+vnsra_vv 101101 . ..... ..... 000 ..... 1010111 @r_vm
+vnsra_vx 101101 . ..... ..... 100 ..... 1010111 @r_vm
+vnsra_vi 101101 . ..... ..... 011 ..... 1010111 @r_vm
+
vsetvli 0 ........... ..... 111 ..... 1010111 @r2_zimm
vsetvl 1000000 ..... ..... 111 ..... 1010111 @r
diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index a1c1960..6af29d0 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -204,5 +204,30 @@ GEN_VECTOR_R_VM(vwsubu_wx)
GEN_VECTOR_R_VM(vwsub_wv)
GEN_VECTOR_R_VM(vwsub_wx)
+GEN_VECTOR_R_VM(vand_vv)
+GEN_VECTOR_R_VM(vand_vx)
+GEN_VECTOR_R_VM(vand_vi)
+GEN_VECTOR_R_VM(vor_vv)
+GEN_VECTOR_R_VM(vor_vx)
+GEN_VECTOR_R_VM(vor_vi)
+GEN_VECTOR_R_VM(vxor_vv)
+GEN_VECTOR_R_VM(vxor_vx)
+GEN_VECTOR_R_VM(vxor_vi)
+GEN_VECTOR_R_VM(vsll_vv)
+GEN_VECTOR_R_VM(vsll_vx)
+GEN_VECTOR_R_VM(vsll_vi)
+GEN_VECTOR_R_VM(vsrl_vv)
+GEN_VECTOR_R_VM(vsrl_vx)
+GEN_VECTOR_R_VM(vsrl_vi)
+GEN_VECTOR_R_VM(vsra_vv)
+GEN_VECTOR_R_VM(vsra_vx)
+GEN_VECTOR_R_VM(vsra_vi)
+GEN_VECTOR_R_VM(vnsrl_vv)
+GEN_VECTOR_R_VM(vnsrl_vx)
+GEN_VECTOR_R_VM(vnsrl_vi)
+GEN_VECTOR_R_VM(vnsra_vv)
+GEN_VECTOR_R_VM(vnsra_vx)
+GEN_VECTOR_R_VM(vnsra_vi)
+
GEN_VECTOR_R2_ZIMM(vsetvli)
GEN_VECTOR_R(vsetvl)
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 95336c9..298a10a 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -268,6 +268,25 @@ static void vector_tail_widen(CPURISCVState *env, int vreg, int index,
}
}
+static void vector_tail_narrow(CPURISCVState *env, int vreg, int index,
+ int width)
+{
+ switch (width) {
+ case 8:
+ env->vfp.vreg[vreg].u8[index] = 0;
+ break;
+ case 16:
+ env->vfp.vreg[vreg].u16[index] = 0;
+ break;
+ case 32:
+ env->vfp.vreg[vreg].u32[index] = 0;
+ break;
+ default:
+ helper_raise_exception(env, RISCV_EXCP_ILLEGAL_INST);
+ return;
+ }
+}
+
static inline int vector_get_carry(CPURISCVState *env, int width, int lmul,
int index)
{
@@ -7131,3 +7150,1461 @@ void VECTOR_HELPER(vwsub_wx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
}
env->vfp.vstart = 0;
}
+
+void VECTOR_HELPER(vand_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src1, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src1].u8[j]
+ & env->vfp.vreg[src2].u8[j];
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src1].u16[j]
+ & env->vfp.vreg[src2].u16[j];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src1].u32[j]
+ & env->vfp.vreg[src2].u32[j];
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src1].u64[j]
+ & env->vfp.vreg[src2].u64[j];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vand_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u8[j] = env->gpr[rs1]
+ & env->vfp.vreg[src2].u8[j];
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[j] = env->gpr[rs1]
+ & env->vfp.vreg[src2].u16[j];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[j] = env->gpr[rs1]
+ & env->vfp.vreg[src2].u32[j];
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[j] =
+ (uint64_t)extend_gpr(env->gpr[rs1])
+ & env->vfp.vreg[src2].u64[j];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vand_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[j] = sign_extend(rs1, 5)
+ & env->vfp.vreg[src2].s8[j];
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[j] = sign_extend(rs1, 5)
+ & env->vfp.vreg[src2].s16[j];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[j] = sign_extend(rs1, 5)
+ & env->vfp.vreg[src2].s32[j];
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[j] = sign_extend(rs1, 5)
+ & env->vfp.vreg[src2].s64[j];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vor_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src1, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src1].u8[j]
+ | env->vfp.vreg[src2].u8[j];
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src1].u16[j]
+ | env->vfp.vreg[src2].u16[j];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src1].u32[j]
+ | env->vfp.vreg[src2].u32[j];
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src1].u64[j]
+ | env->vfp.vreg[src2].u64[j];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vor_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u8[j] = env->gpr[rs1]
+ | env->vfp.vreg[src2].u8[j];
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[j] = env->gpr[rs1]
+ | env->vfp.vreg[src2].u16[j];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[j] = env->gpr[rs1]
+ | env->vfp.vreg[src2].u32[j];
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[j] =
+ (uint64_t)extend_gpr(env->gpr[rs1])
+ | env->vfp.vreg[src2].u64[j];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vor_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[j] = sign_extend(rs1, 5)
+ | env->vfp.vreg[src2].s8[j];
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[j] = sign_extend(rs1, 5)
+ | env->vfp.vreg[src2].s16[j];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[j] = sign_extend(rs1, 5)
+ | env->vfp.vreg[src2].s32[j];
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[j] = sign_extend(rs1, 5)
+ | env->vfp.vreg[src2].s64[j];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vxor_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src1, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src1].u8[j]
+ ^ env->vfp.vreg[src2].u8[j];
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src1].u16[j]
+ ^ env->vfp.vreg[src2].u16[j];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src1].u32[j]
+ ^ env->vfp.vreg[src2].u32[j];
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src1].u64[j]
+ ^ env->vfp.vreg[src2].u64[j];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vxor_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u8[j] = env->gpr[rs1]
+ ^ env->vfp.vreg[src2].u8[j];
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[j] = env->gpr[rs1]
+ ^ env->vfp.vreg[src2].u16[j];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[j] = env->gpr[rs1]
+ ^ env->vfp.vreg[src2].u32[j];
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[j] =
+ (uint64_t)extend_gpr(env->gpr[rs1])
+ ^ env->vfp.vreg[src2].u64[j];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vxor_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[j] = sign_extend(rs1, 5)
+ ^ env->vfp.vreg[src2].s8[j];
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[j] = sign_extend(rs1, 5)
+ ^ env->vfp.vreg[src2].s16[j];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[j] = sign_extend(rs1, 5)
+ ^ env->vfp.vreg[src2].s32[j];
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[j] = sign_extend(rs1, 5)
+ ^ env->vfp.vreg[src2].s64[j];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vsll_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src1, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src2].u8[j]
+ << (env->vfp.vreg[src1].u8[j] & 0x7);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src2].u16[j]
+ << (env->vfp.vreg[src1].u16[j] & 0xf);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src2].u32[j]
+ << (env->vfp.vreg[src1].u32[j] & 0x1f);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src2].u64[j]
+ << (env->vfp.vreg[src1].u64[j] & 0x3f);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vsll_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src2].u8[j]
+ << (env->gpr[rs1] & 0x7);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src2].u16[j]
+ << (env->gpr[rs1] & 0xf);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src2].u32[j]
+ << (env->gpr[rs1] & 0x1f);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src2].u64[j]
+ << ((uint64_t)extend_gpr(env->gpr[rs1]) & 0x3f);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vsll_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src2].u8[j]
+ << (rs1);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src2].u16[j]
+ << (rs1);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src2].u32[j]
+ << (rs1);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src2].u64[j]
+ << (rs1);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vsrl_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src1, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src2].u8[j]
+ >> (env->vfp.vreg[src1].u8[j] & 0x7);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src2].u16[j]
+ >> (env->vfp.vreg[src1].u16[j] & 0xf);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src2].u32[j]
+ >> (env->vfp.vreg[src1].u32[j] & 0x1f);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src2].u64[j]
+ >> (env->vfp.vreg[src1].u64[j] & 0x3f);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vsrl_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src2].u8[j]
+ >> (env->gpr[rs1] & 0x7);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src2].u16[j]
+ >> (env->gpr[rs1] & 0xf);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src2].u32[j]
+ >> (env->gpr[rs1] & 0x1f);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src2].u64[j]
+ >> ((uint64_t)extend_gpr(env->gpr[rs1]) & 0x3f);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vsrl_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src2].u8[j]
+ >> (rs1);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src2].u16[j]
+ >> (rs1);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src2].u32[j]
+ >> (rs1);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src2].u64[j]
+ >> (rs1);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vsra_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src1, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[j] = env->vfp.vreg[src2].s8[j]
+ >> (env->vfp.vreg[src1].s8[j] & 0x7);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[j] = env->vfp.vreg[src2].s16[j]
+ >> (env->vfp.vreg[src1].s16[j] & 0xf);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[j] = env->vfp.vreg[src2].s32[j]
+ >> (env->vfp.vreg[src1].s32[j] & 0x1f);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[j] = env->vfp.vreg[src2].s64[j]
+ >> (env->vfp.vreg[src1].s64[j] & 0x3f);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vsra_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[j] = env->vfp.vreg[src2].s8[j]
+ >> (env->gpr[rs1] & 0x7);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[j] = env->vfp.vreg[src2].s16[j]
+ >> (env->gpr[rs1] & 0xf);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[j] = env->vfp.vreg[src2].s32[j]
+ >> (env->gpr[rs1] & 0x1f);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[j] = env->vfp.vreg[src2].s64[j]
+ >> ((uint64_t)extend_gpr(env->gpr[rs1]) & 0x3f);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vsra_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[j] = env->vfp.vreg[src2].s8[j]
+ >> (rs1);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[j] = env->vfp.vreg[src2].s16[j]
+ >> (rs1);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[j] = env->vfp.vreg[src2].s32[j]
+ >> (rs1);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[j] = env->vfp.vreg[src2].s64[j]
+ >> (rs1);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vnsrl_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl;
+ uint32_t lmul, width, src1, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) ||
+ vector_overlap_vm_common(lmul, vm, rd) ||
+ vector_overlap_dstgp_srcgp(rd, lmul, rs2, 2 * lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, true);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / (2 * width)));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src2].u16[k]
+ >> (env->vfp.vreg[src1].u8[j] & 0xf);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src2].u32[k]
+ >> (env->vfp.vreg[src1].u16[j] & 0x1f);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src2].u64[k]
+ >> (env->vfp.vreg[src1].u32[j] & 0x3f);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_narrow(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vnsrl_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) ||
+ vector_overlap_vm_common(lmul, vm, rd) ||
+ vector_overlap_dstgp_srcgp(rd, lmul, rs2, 2 * lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, true);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / (2 * width)));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src2].u16[k]
+ >> (env->gpr[rs1] & 0xf);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src2].u32[k]
+ >> (env->gpr[rs1] & 0x1f);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src2].u64[k]
+ >> (env->gpr[rs1] & 0x3f);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_narrow(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vnsrl_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) ||
+ vector_overlap_vm_common(lmul, vm, rd) ||
+ vector_overlap_dstgp_srcgp(rd, lmul, rs2, 2 * lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, true);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / (2 * width)));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src2].u16[k]
+ >> (rs1);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src2].u32[k]
+ >> (rs1);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src2].u64[k]
+ >> (rs1);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_narrow(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vnsra_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl;
+ uint32_t lmul, width, src1, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) ||
+ vector_overlap_vm_common(lmul, vm, rd) ||
+ vector_overlap_dstgp_srcgp(rd, lmul, rs2, 2 * lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, true);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / (2 * width)));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[j] = env->vfp.vreg[src2].s16[k]
+ >> (env->vfp.vreg[src1].s8[j] & 0xf);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[j] = env->vfp.vreg[src2].s32[k]
+ >> (env->vfp.vreg[src1].s16[j] & 0x1f);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[j] = env->vfp.vreg[src2].s64[k]
+ >> (env->vfp.vreg[src1].s32[j] & 0x3f);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_narrow(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vnsra_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) ||
+ vector_overlap_vm_common(lmul, vm, rd) ||
+ vector_overlap_dstgp_srcgp(rd, lmul, rs2, 2 * lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, true);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / (2 * width)));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[j] = env->vfp.vreg[src2].s16[k]
+ >> (env->gpr[rs1] & 0xf);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[j] = env->vfp.vreg[src2].s32[k]
+ >> (env->gpr[rs1] & 0x1f);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[j] = env->vfp.vreg[src2].s64[k]
+ >> (env->gpr[rs1] & 0x3f);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_narrow(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vnsra_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) ||
+ vector_overlap_vm_common(lmul, vm, rd) ||
+ vector_overlap_dstgp_srcgp(rd, lmul, rs2, 2 * lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, true);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / (2 * width)));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[j] = env->vfp.vreg[src2].s16[k]
+ >> (rs1);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[j] = env->vfp.vreg[src2].s32[k]
+ >> (rs1);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[j] = env->vfp.vreg[src2].s64[k]
+ >> (rs1);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_narrow(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
--
2.7.4
* Re: [Qemu-devel] [PATCH v2 09/17] RISC-V: add vector extension integer instructions part2, bit/shift
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 09/17] RISC-V: add vector extension integer instructions part2, bit/shift liuzhiwei
@ 2019-09-12 16:41 ` Richard Henderson
From: Richard Henderson @ 2019-09-12 16:41 UTC (permalink / raw)
To: liuzhiwei, Alistair.Francis, palmer, sagark, kbastian,
riku.voipio, laurent, wenmeng_zhang
Cc: qemu-riscv, qemu-devel, wxy194768
On 9/11/19 2:25 AM, liuzhiwei wrote:
> +void VECTOR_HELPER(vand_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
> + uint32_t rs2, uint32_t rd)
> +{
> + int i, j, vl;
> + uint32_t lmul, width, src1, src2, dest, vlmax;
> +
> + vl = env->vfp.vl;
> + lmul = vector_get_lmul(env);
> + width = vector_get_width(env);
> + vlmax = vector_get_vlmax(env);
> +
> + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
> + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
> + return;
> + }
> + vector_lmul_check_reg(env, lmul, rs1, false);
> + vector_lmul_check_reg(env, lmul, rs2, false);
> + vector_lmul_check_reg(env, lmul, rd, false);
> +
> + for (i = 0; i < vlmax; i++) {
> + src1 = rs1 + (i / (VLEN / width));
> + src2 = rs2 + (i / (VLEN / width));
> + dest = rd + (i / (VLEN / width));
> + j = i % (VLEN / width);
> + if (i < env->vfp.vstart) {
> + continue;
> + } else if (i < vl) {
> + switch (width) {
> + case 8:
> + if (vector_elem_mask(env, vm, width, lmul, i)) {
> + env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src1].u8[j]
> + & env->vfp.vreg[src2].u8[j];
> + }
> + break;
Note that a non-predicated logical operation need not consider the width. All
of the widths perform the same operation, and therefore having the host operate
on u64 is fastest. This is another good reason to notice vm=1 within the
translator and use separate helper functions for masked vs non-masked.
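As a rough illustration of the point above, here is a minimal sketch of an unmasked AND that never looks at the element width. The names vreg_t and vand_unmasked are hypothetical stand-ins, not part of the patch; VLEN = 128 is taken from the series. The sketch processes whole registers and deliberately leaves out vl, vstart and tail handling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one vector register; VLEN = 128 as in the series. */
#define VLEN 128
typedef union {
    uint64_t u64[VLEN / 64];
    uint8_t  u8[VLEN / 8];
} vreg_t;

/*
 * Unmasked bitwise AND over nregs consecutive registers.
 * The element width never appears: AND acts on each bit independently,
 * so the host can work in 64-bit units whatever SEW is.
 * (vl, vstart and tail handling are intentionally omitted.)
 */
static void vand_unmasked(vreg_t *d, const vreg_t *a, const vreg_t *b,
                          size_t nregs)
{
    assert(d && a && b);
    for (size_t r = 0; r < nregs; r++) {
        for (size_t w = 0; w < VLEN / 64; w++) {
            d[r].u64[w] = a[r].u64[w] & b[r].u64[w];
        }
    }
}
```

With lmul = 2 this would be called with nregs = 2, and the result is the same byte for byte as the per-width switch in the quoted helper.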
> +void VECTOR_HELPER(vand_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
> + uint32_t rs2, uint32_t rd)
...
> +void VECTOR_HELPER(vand_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
> + uint32_t rs2, uint32_t rd)
As with the previous set of arithmetic instructions, these should be a single
helper that is passed a 64-bit scalar.
Note that scalars smaller than 64-bit can be replicated with dup_const(). At
which point the logical operation is easily performed in 64-bit units instead
of any smaller unit.
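A small sketch of what the replication amounts to, assuming nothing beyond plain C. replicate_scalar and and_scalar_u64 are hypothetical names used only for illustration; replicate_scalar takes the element width in bits and is meant to mirror the idea behind dup_const(), not its exact signature.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Replicate the low `width` bits of c across a 64-bit word.
 * Hypothetical helper; illustrates the idea behind dup_const().
 */
static uint64_t replicate_scalar(unsigned width, uint64_t c)
{
    switch (width) {
    case 8:
        return 0x0101010101010101ull * (uint8_t)c;
    case 16:
        return 0x0001000100010001ull * (uint16_t)c;
    case 32:
        return 0x0000000100000001ull * (uint32_t)c;
    case 64:
        return c;
    default:
        assert(!"unsupported element width");
        return 0;
    }
}

/* AND every 64-bit unit of a[] with the replicated scalar x. */
static void and_scalar_u64(uint64_t *d, const uint64_t *a, size_t n,
                           unsigned width, uint64_t x)
{
    uint64_t rep = replicate_scalar(width, x);

    assert(d && a);
    for (size_t i = 0; i < n; i++) {
        d[i] = a[i] & rep;
    }
}
```

Once the scalar has been replicated, the .vx and .vi forms differ only in where the 64-bit value comes from, which is why a single helper can serve both.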
Note that predication can be handled via logical masking. For ARM SVE, we have
a set of functions that map the active bits of a predicate mask to byte masks.
See e.g.
    static inline uint64_t expand_pred_b(uint8_t byte)
    static inline uint64_t expand_pred_h(uint8_t byte)
    static inline uint64_t expand_pred_s(uint8_t byte)
so that the predicated logical and operation looks like
    mask = expand_pred_n(env->vfp.vreg[0].u8[i]);
    result = in1 & in2;
    dest = (result & mask) | (dest & ~mask);
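To make the merge step concrete, here is a self-contained sketch for byte elements. This expand_pred_b is a loop-based stand-in written for illustration; QEMU's own version may be implemented differently (for example as a table lookup). and_predicated and and_predicated_vec are hypothetical names. The sketch assumes one predicate bit per byte element, packed eight to a byte, which may not match the mask layout used by this series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Expand each of the 8 predicate bits into a full byte mask:
 * bit i set -> bits [8*i, 8*i+7] of the result are all ones.
 * Loop-based stand-in, for illustration only.
 */
static uint64_t expand_pred_b(uint8_t byte)
{
    uint64_t mask = 0;

    for (int i = 0; i < 8; i++) {
        if (byte & (1u << i)) {
            mask |= (uint64_t)0xff << (i * 8);
        }
    }
    return mask;
}

/* Predicated AND of one 64-bit unit: inactive byte lanes keep dest. */
static uint64_t and_predicated(uint64_t dest, uint64_t in1, uint64_t in2,
                               uint8_t pred)
{
    uint64_t mask = expand_pred_b(pred);
    uint64_t result = in1 & in2;

    return (result & mask) | (dest & ~mask);
}

/* The same merge over n 64-bit units, one predicate byte per unit. */
static void and_predicated_vec(uint64_t *d, const uint64_t *a,
                               const uint64_t *b, const uint8_t *pred,
                               size_t n)
{
    assert(d && a && b && pred);
    for (size_t i = 0; i < n; i++) {
        d[i] = and_predicated(d[i], a[i], b[i], pred[i]);
    }
}
```

The inner loop has no per-element branch: the predicate only selects which bits of the 64-bit result replace the old destination bits.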
r~
* [Qemu-devel] [PATCH v2 10/17] RISC-V: add vector extension integer instructions part3, cmp/min/max
2019-09-11 6:25 [Qemu-devel] [PATCH v2 00/17] RISC-V: support vector extension liuzhiwei
` (8 preceding siblings ...)
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 09/17] RISC-V: add vector extension integer instructions part2, bit/shift liuzhiwei
@ 2019-09-11 6:25 ` liuzhiwei
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 11/17] RISC-V: add vector extension integer instructions part4, mul/div/merge liuzhiwei
` (7 subsequent siblings)
From: liuzhiwei @ 2019-09-11 6:25 UTC (permalink / raw)
To: Alistair.Francis, palmer, sagark, kbastian, riku.voipio, laurent,
wenmeng_zhang
Cc: qemu-riscv, qemu-devel, wxy194768, LIU Zhiwei
From: LIU Zhiwei <zhiwei_liu@c-sky.com>
Signed-off-by: LIU Zhiwei <zhiwei_liu@c-sky.com>
---
target/riscv/helper.h | 29 +
target/riscv/insn32.decode | 29 +
target/riscv/insn_trans/trans_rvv.inc.c | 29 +
target/riscv/vector_helper.c | 2280 +++++++++++++++++++++++++++++++
4 files changed, 2367 insertions(+)
diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 28863e2..7354b12 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -200,5 +200,34 @@ DEF_HELPER_5(vector_vnsra_vv, void, env, i32, i32, i32, i32)
DEF_HELPER_5(vector_vnsra_vx, void, env, i32, i32, i32, i32)
DEF_HELPER_5(vector_vnsra_vi, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vminu_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vminu_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmin_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmin_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmaxu_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmaxu_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmax_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmax_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmseq_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmseq_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmseq_vi, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmsne_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmsne_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmsne_vi, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmsltu_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmsltu_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmslt_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmslt_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmsleu_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmsleu_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmsleu_vi, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmsle_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmsle_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmsle_vi, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmsgtu_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmsgtu_vi, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmsgt_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmsgt_vi, void, env, i32, i32, i32, i32)
+
DEF_HELPER_4(vector_vsetvli, void, env, i32, i32, i32)
DEF_HELPER_4(vector_vsetvl, void, env, i32, i32, i32)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 19710f5..1ff0b08 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -340,5 +340,34 @@ vnsra_vv 101101 . ..... ..... 000 ..... 1010111 @r_vm
vnsra_vx 101101 . ..... ..... 100 ..... 1010111 @r_vm
vnsra_vi 101101 . ..... ..... 011 ..... 1010111 @r_vm
+vmseq_vv 011000 . ..... ..... 000 ..... 1010111 @r_vm
+vmseq_vx 011000 . ..... ..... 100 ..... 1010111 @r_vm
+vmseq_vi 011000 . ..... ..... 011 ..... 1010111 @r_vm
+vmsne_vv 011001 . ..... ..... 000 ..... 1010111 @r_vm
+vmsne_vx 011001 . ..... ..... 100 ..... 1010111 @r_vm
+vmsne_vi 011001 . ..... ..... 011 ..... 1010111 @r_vm
+vmsltu_vv 011010 . ..... ..... 000 ..... 1010111 @r_vm
+vmsltu_vx 011010 . ..... ..... 100 ..... 1010111 @r_vm
+vmslt_vv 011011 . ..... ..... 000 ..... 1010111 @r_vm
+vmslt_vx 011011 . ..... ..... 100 ..... 1010111 @r_vm
+vmsleu_vv 011100 . ..... ..... 000 ..... 1010111 @r_vm
+vmsleu_vx 011100 . ..... ..... 100 ..... 1010111 @r_vm
+vmsleu_vi 011100 . ..... ..... 011 ..... 1010111 @r_vm
+vmsle_vv 011101 . ..... ..... 000 ..... 1010111 @r_vm
+vmsle_vx 011101 . ..... ..... 100 ..... 1010111 @r_vm
+vmsle_vi 011101 . ..... ..... 011 ..... 1010111 @r_vm
+vmsgtu_vx 011110 . ..... ..... 100 ..... 1010111 @r_vm
+vmsgtu_vi 011110 . ..... ..... 011 ..... 1010111 @r_vm
+vmsgt_vx 011111 . ..... ..... 100 ..... 1010111 @r_vm
+vmsgt_vi 011111 . ..... ..... 011 ..... 1010111 @r_vm
+vminu_vv 000100 . ..... ..... 000 ..... 1010111 @r_vm
+vminu_vx 000100 . ..... ..... 100 ..... 1010111 @r_vm
+vmin_vv 000101 . ..... ..... 000 ..... 1010111 @r_vm
+vmin_vx 000101 . ..... ..... 100 ..... 1010111 @r_vm
+vmaxu_vv 000110 . ..... ..... 000 ..... 1010111 @r_vm
+vmaxu_vx 000110 . ..... ..... 100 ..... 1010111 @r_vm
+vmax_vv 000111 . ..... ..... 000 ..... 1010111 @r_vm
+vmax_vx 000111 . ..... ..... 100 ..... 1010111 @r_vm
+
vsetvli 0 ........... ..... 111 ..... 1010111 @r2_zimm
vsetvl 1000000 ..... ..... 111 ..... 1010111 @r
diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index 6af29d0..cd5ab07 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -229,5 +229,34 @@ GEN_VECTOR_R_VM(vnsra_vv)
GEN_VECTOR_R_VM(vnsra_vx)
GEN_VECTOR_R_VM(vnsra_vi)
+GEN_VECTOR_R_VM(vmseq_vv)
+GEN_VECTOR_R_VM(vmseq_vx)
+GEN_VECTOR_R_VM(vmseq_vi)
+GEN_VECTOR_R_VM(vmsne_vv)
+GEN_VECTOR_R_VM(vmsne_vx)
+GEN_VECTOR_R_VM(vmsne_vi)
+GEN_VECTOR_R_VM(vmsltu_vv)
+GEN_VECTOR_R_VM(vmsltu_vx)
+GEN_VECTOR_R_VM(vmslt_vv)
+GEN_VECTOR_R_VM(vmslt_vx)
+GEN_VECTOR_R_VM(vmsleu_vv)
+GEN_VECTOR_R_VM(vmsleu_vx)
+GEN_VECTOR_R_VM(vmsleu_vi)
+GEN_VECTOR_R_VM(vmsle_vv)
+GEN_VECTOR_R_VM(vmsle_vx)
+GEN_VECTOR_R_VM(vmsle_vi)
+GEN_VECTOR_R_VM(vmsgtu_vx)
+GEN_VECTOR_R_VM(vmsgtu_vi)
+GEN_VECTOR_R_VM(vmsgt_vx)
+GEN_VECTOR_R_VM(vmsgt_vi)
+GEN_VECTOR_R_VM(vminu_vv)
+GEN_VECTOR_R_VM(vminu_vx)
+GEN_VECTOR_R_VM(vmin_vv)
+GEN_VECTOR_R_VM(vmin_vx)
+GEN_VECTOR_R_VM(vmaxu_vv)
+GEN_VECTOR_R_VM(vmaxu_vx)
+GEN_VECTOR_R_VM(vmax_vv)
+GEN_VECTOR_R_VM(vmax_vx)
+
GEN_VECTOR_R2_ZIMM(vsetvli)
GEN_VECTOR_R(vsetvl)
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 298a10a..fbf2145 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -8608,3 +8608,2283 @@ void VECTOR_HELPER(vnsra_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
env->vfp.vstart = 0;
}
+void VECTOR_HELPER(vmseq_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src1, src2, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src1].u8[j] ==
+ env->vfp.vreg[src2].u8[j]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src1].u16[j] ==
+ env->vfp.vreg[src2].u16[j]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src1].u32[j] ==
+ env->vfp.vreg[src2].u32[j]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src1].u64[j] ==
+ env->vfp.vreg[src2].u64[j]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ if (width <= 64) {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ } else {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vmseq_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((uint8_t)env->gpr[rs1] == env->vfp.vreg[src2].u8[j]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((uint16_t)env->gpr[rs1] == env->vfp.vreg[src2].u16[j]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((uint32_t)env->gpr[rs1] == env->vfp.vreg[src2].u32[j]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((uint64_t)extend_gpr(env->gpr[rs1]) ==
+ env->vfp.vreg[src2].u64[j]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ if (width <= 64) {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ } else {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vmseq_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((uint8_t)sign_extend(rs1, 5)
+ == env->vfp.vreg[src2].u8[j]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((uint16_t)sign_extend(rs1, 5)
+ == env->vfp.vreg[src2].u16[j]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((uint32_t)sign_extend(rs1, 5)
+ == env->vfp.vreg[src2].u32[j]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((uint64_t)sign_extend(rs1, 5) ==
+ env->vfp.vreg[src2].u64[j]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ if (width <= 64) {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ } else {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vmsne_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src1, src2, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src1].u8[j] !=
+ env->vfp.vreg[src2].u8[j]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src1].u16[j] !=
+ env->vfp.vreg[src2].u16[j]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src1].u32[j] !=
+ env->vfp.vreg[src2].u32[j]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src1].u64[j] !=
+ env->vfp.vreg[src2].u64[j]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ if (width <= 64) {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ } else {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vmsne_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((uint8_t)env->gpr[rs1] != env->vfp.vreg[src2].u8[j]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((uint16_t)env->gpr[rs1] != env->vfp.vreg[src2].u16[j]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((uint32_t)env->gpr[rs1] != env->vfp.vreg[src2].u32[j]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((uint64_t)extend_gpr(env->gpr[rs1]) !=
+ env->vfp.vreg[src2].u64[j]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ if (width <= 64) {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ } else {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vmsne_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((uint8_t)sign_extend(rs1, 5)
+ != env->vfp.vreg[src2].u8[j]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((uint16_t)sign_extend(rs1, 5)
+ != env->vfp.vreg[src2].u16[j]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((uint32_t)sign_extend(rs1, 5)
+ != env->vfp.vreg[src2].u32[j]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((uint64_t)sign_extend(rs1, 5) !=
+ env->vfp.vreg[src2].u64[j]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ if (width <= 64) {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ } else {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vmsltu_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src1, src2, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].u8[j] <
+ env->vfp.vreg[src1].u8[j]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].u16[j] <
+ env->vfp.vreg[src1].u16[j]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].u32[j] <
+ env->vfp.vreg[src1].u32[j]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].u64[j] <
+ env->vfp.vreg[src1].u64[j]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ if (width <= 64) {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ } else {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vmsltu_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].u8[j] < (uint8_t)env->gpr[rs1]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].u16[j] < (uint16_t)env->gpr[rs1]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].u32[j] < (uint32_t)env->gpr[rs1]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].u64[j] <
+ (uint64_t)extend_gpr(env->gpr[rs1])) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ if (width <= 64) {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ } else {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vmslt_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src1, src2, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].s8[j] <
+ env->vfp.vreg[src1].s8[j]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].s16[j] <
+ env->vfp.vreg[src1].s16[j]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].s32[j] <
+ env->vfp.vreg[src1].s32[j]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].s64[j] <
+ env->vfp.vreg[src1].s64[j]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ if (width <= 64) {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ } else {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vmslt_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].s8[j] < (int8_t)env->gpr[rs1]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].s16[j] < (int16_t)env->gpr[rs1]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].s32[j] < (int32_t)env->gpr[rs1]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].s64[j] <
+ (int64_t)extend_gpr(env->gpr[rs1])) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ if (width <= 64) {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ } else {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vmsleu_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src1, src2, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].u8[j] <=
+ env->vfp.vreg[src1].u8[j]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].u16[j] <=
+ env->vfp.vreg[src1].u16[j]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].u32[j] <=
+ env->vfp.vreg[src1].u32[j]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].u64[j] <=
+ env->vfp.vreg[src1].u64[j]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ if (width <= 64) {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ } else {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vmsleu_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].u8[j] <= (uint8_t)env->gpr[rs1]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].u16[j] <= (uint16_t)env->gpr[rs1]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].u32[j] <= (uint32_t)env->gpr[rs1]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].u64[j] <=
+ (uint64_t)extend_gpr(env->gpr[rs1])) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ if (width <= 64) {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ } else {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vmsleu_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].u8[j] <= (uint8_t)rs1) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].u16[j] <= (uint16_t)rs1) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].u32[j] <= (uint32_t)rs1) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].u64[j] <=
+ (uint64_t)rs1) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ if (width <= 64) {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ } else {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vmsle_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src1, src2, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].s8[j] <=
+ env->vfp.vreg[src1].s8[j]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].s16[j] <=
+ env->vfp.vreg[src1].s16[j]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].s32[j] <=
+ env->vfp.vreg[src1].s32[j]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].s64[j] <=
+ env->vfp.vreg[src1].s64[j]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ if (width <= 64) {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ } else {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vmsle_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].s8[j] <= (int8_t)env->gpr[rs1]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].s16[j] <= (int16_t)env->gpr[rs1]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].s32[j] <= (int32_t)env->gpr[rs1]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].s64[j] <=
+ (int64_t)extend_gpr(env->gpr[rs1])) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ if (width <= 64) {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ } else {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vmsle_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].s8[j] <=
+ (int8_t)sign_extend(rs1, 5)) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].s16[j] <=
+ (int16_t)sign_extend(rs1, 5)) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].s32[j] <=
+ (int32_t)sign_extend(rs1, 5)) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].s64[j] <=
+ sign_extend(rs1, 5)) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ if (width <= 64) {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ } else {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vmsgtu_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].u8[j] > (uint8_t)env->gpr[rs1]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].u16[j] > (uint16_t)env->gpr[rs1]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].u32[j] > (uint32_t)env->gpr[rs1]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].u64[j] >
+ (uint64_t)extend_gpr(env->gpr[rs1])) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ if (width <= 64) {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ } else {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vmsgtu_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].u8[j] > (uint8_t)rs1) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].u16[j] > (uint16_t)rs1) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].u32[j] > (uint32_t)rs1) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].u64[j] >
+ (uint64_t)rs1) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ if (width <= 64) {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ } else {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vmsgt_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, vlmax;
+
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].s8[j] > (int8_t)env->gpr[rs1]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].s16[j] > (int16_t)env->gpr[rs1]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].s32[j] > (int32_t)env->gpr[rs1]) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].s64[j] >
+ (int64_t)extend_gpr(env->gpr[rs1])) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ if (width <= 64) {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ } else {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vmsgt_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, vlmax;
+
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].s8[j] >
+ (int8_t)sign_extend(rs1, 5)) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].s16[j] >
+ (int16_t)sign_extend(rs1, 5)) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].s32[j] >
+ (int32_t)sign_extend(rs1, 5)) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src2].s64[j] >
+ sign_extend(rs1, 5)) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ if (width <= 64) {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ } else {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vminu_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src1, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src1].u8[j] <=
+ env->vfp.vreg[src2].u8[j]) {
+ env->vfp.vreg[dest].u8[j] =
+ env->vfp.vreg[src1].u8[j];
+ } else {
+ env->vfp.vreg[dest].u8[j] =
+ env->vfp.vreg[src2].u8[j];
+ }
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src1].u16[j] <=
+ env->vfp.vreg[src2].u16[j]) {
+ env->vfp.vreg[dest].u16[j] =
+ env->vfp.vreg[src1].u16[j];
+ } else {
+ env->vfp.vreg[dest].u16[j] =
+ env->vfp.vreg[src2].u16[j];
+ }
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src1].u32[j] <=
+ env->vfp.vreg[src2].u32[j]) {
+ env->vfp.vreg[dest].u32[j] =
+ env->vfp.vreg[src1].u32[j];
+ } else {
+ env->vfp.vreg[dest].u32[j] =
+ env->vfp.vreg[src2].u32[j];
+ }
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src1].u64[j] <=
+ env->vfp.vreg[src2].u64[j]) {
+ env->vfp.vreg[dest].u64[j] =
+ env->vfp.vreg[src1].u64[j];
+ } else {
+ env->vfp.vreg[dest].u64[j] =
+ env->vfp.vreg[src2].u64[j];
+ }
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vminu_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((uint8_t)env->gpr[rs1] <=
+ env->vfp.vreg[src2].u8[j]) {
+ env->vfp.vreg[dest].u8[j] =
+ env->gpr[rs1];
+ } else {
+ env->vfp.vreg[dest].u8[j] =
+ env->vfp.vreg[src2].u8[j];
+ }
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((uint16_t)env->gpr[rs1] <=
+ env->vfp.vreg[src2].u16[j]) {
+ env->vfp.vreg[dest].u16[j] =
+ env->gpr[rs1];
+ } else {
+ env->vfp.vreg[dest].u16[j] =
+ env->vfp.vreg[src2].u16[j];
+ }
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((uint32_t)env->gpr[rs1] <=
+ env->vfp.vreg[src2].u32[j]) {
+ env->vfp.vreg[dest].u32[j] =
+ env->gpr[rs1];
+ } else {
+ env->vfp.vreg[dest].u32[j] =
+ env->vfp.vreg[src2].u32[j];
+ }
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((uint64_t)extend_gpr(env->gpr[rs1]) <=
+ env->vfp.vreg[src2].u64[j]) {
+ env->vfp.vreg[dest].u64[j] =
+ (uint64_t)extend_gpr(env->gpr[rs1]);
+ } else {
+ env->vfp.vreg[dest].u64[j] =
+ env->vfp.vreg[src2].u64[j];
+ }
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vmin_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src1, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src1].s8[j] <=
+ env->vfp.vreg[src2].s8[j]) {
+ env->vfp.vreg[dest].s8[j] =
+ env->vfp.vreg[src1].s8[j];
+ } else {
+ env->vfp.vreg[dest].s8[j] =
+ env->vfp.vreg[src2].s8[j];
+ }
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src1].s16[j] <=
+ env->vfp.vreg[src2].s16[j]) {
+ env->vfp.vreg[dest].s16[j] =
+ env->vfp.vreg[src1].s16[j];
+ } else {
+ env->vfp.vreg[dest].s16[j] =
+ env->vfp.vreg[src2].s16[j];
+ }
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src1].s32[j] <=
+ env->vfp.vreg[src2].s32[j]) {
+ env->vfp.vreg[dest].s32[j] =
+ env->vfp.vreg[src1].s32[j];
+ } else {
+ env->vfp.vreg[dest].s32[j] =
+ env->vfp.vreg[src2].s32[j];
+ }
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src1].s64[j] <=
+ env->vfp.vreg[src2].s64[j]) {
+ env->vfp.vreg[dest].s64[j] =
+ env->vfp.vreg[src1].s64[j];
+ } else {
+ env->vfp.vreg[dest].s64[j] =
+ env->vfp.vreg[src2].s64[j];
+ }
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vmin_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((int8_t)env->gpr[rs1] <=
+ env->vfp.vreg[src2].s8[j]) {
+ env->vfp.vreg[dest].s8[j] =
+ env->gpr[rs1];
+ } else {
+ env->vfp.vreg[dest].s8[j] =
+ env->vfp.vreg[src2].s8[j];
+ }
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((int16_t)env->gpr[rs1] <=
+ env->vfp.vreg[src2].s16[j]) {
+ env->vfp.vreg[dest].s16[j] =
+ env->gpr[rs1];
+ } else {
+ env->vfp.vreg[dest].s16[j] =
+ env->vfp.vreg[src2].s16[j];
+ }
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((int32_t)env->gpr[rs1] <=
+ env->vfp.vreg[src2].s32[j]) {
+ env->vfp.vreg[dest].s32[j] =
+ env->gpr[rs1];
+ } else {
+ env->vfp.vreg[dest].s32[j] =
+ env->vfp.vreg[src2].s32[j];
+ }
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((int64_t)extend_gpr(env->gpr[rs1]) <=
+ env->vfp.vreg[src2].s64[j]) {
+ env->vfp.vreg[dest].s64[j] =
+ (int64_t)extend_gpr(env->gpr[rs1]);
+ } else {
+ env->vfp.vreg[dest].s64[j] =
+ env->vfp.vreg[src2].s64[j];
+ }
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vmaxu_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src1, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src1].u8[j] >=
+ env->vfp.vreg[src2].u8[j]) {
+ env->vfp.vreg[dest].u8[j] =
+ env->vfp.vreg[src1].u8[j];
+ } else {
+ env->vfp.vreg[dest].u8[j] =
+ env->vfp.vreg[src2].u8[j];
+ }
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src1].u16[j] >=
+ env->vfp.vreg[src2].u16[j]) {
+ env->vfp.vreg[dest].u16[j] =
+ env->vfp.vreg[src1].u16[j];
+ } else {
+ env->vfp.vreg[dest].u16[j] =
+ env->vfp.vreg[src2].u16[j];
+ }
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src1].u32[j] >=
+ env->vfp.vreg[src2].u32[j]) {
+ env->vfp.vreg[dest].u32[j] =
+ env->vfp.vreg[src1].u32[j];
+ } else {
+ env->vfp.vreg[dest].u32[j] =
+ env->vfp.vreg[src2].u32[j];
+ }
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src1].u64[j] >=
+ env->vfp.vreg[src2].u64[j]) {
+ env->vfp.vreg[dest].u64[j] =
+ env->vfp.vreg[src1].u64[j];
+ } else {
+ env->vfp.vreg[dest].u64[j] =
+ env->vfp.vreg[src2].u64[j];
+ }
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vmaxu_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((uint8_t)env->gpr[rs1] >=
+ env->vfp.vreg[src2].u8[j]) {
+ env->vfp.vreg[dest].u8[j] =
+ env->gpr[rs1];
+ } else {
+ env->vfp.vreg[dest].u8[j] =
+ env->vfp.vreg[src2].u8[j];
+ }
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((uint16_t)env->gpr[rs1] >=
+ env->vfp.vreg[src2].u16[j]) {
+ env->vfp.vreg[dest].u16[j] =
+ env->gpr[rs1];
+ } else {
+ env->vfp.vreg[dest].u16[j] =
+ env->vfp.vreg[src2].u16[j];
+ }
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((uint32_t)env->gpr[rs1] >=
+ env->vfp.vreg[src2].u32[j]) {
+ env->vfp.vreg[dest].u32[j] =
+ env->gpr[rs1];
+ } else {
+ env->vfp.vreg[dest].u32[j] =
+ env->vfp.vreg[src2].u32[j];
+ }
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((uint64_t)extend_gpr(env->gpr[rs1]) >=
+ env->vfp.vreg[src2].u64[j]) {
+ env->vfp.vreg[dest].u64[j] =
+ (uint64_t)extend_gpr(env->gpr[rs1]);
+ } else {
+ env->vfp.vreg[dest].u64[j] =
+ env->vfp.vreg[src2].u64[j];
+ }
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vmax_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src1, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src1].s8[j] >=
+ env->vfp.vreg[src2].s8[j]) {
+ env->vfp.vreg[dest].s8[j] =
+ env->vfp.vreg[src1].s8[j];
+ } else {
+ env->vfp.vreg[dest].s8[j] =
+ env->vfp.vreg[src2].s8[j];
+ }
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src1].s16[j] >=
+ env->vfp.vreg[src2].s16[j]) {
+ env->vfp.vreg[dest].s16[j] =
+ env->vfp.vreg[src1].s16[j];
+ } else {
+ env->vfp.vreg[dest].s16[j] =
+ env->vfp.vreg[src2].s16[j];
+ }
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src1].s32[j] >=
+ env->vfp.vreg[src2].s32[j]) {
+ env->vfp.vreg[dest].s32[j] =
+ env->vfp.vreg[src1].s32[j];
+ } else {
+ env->vfp.vreg[dest].s32[j] =
+ env->vfp.vreg[src2].s32[j];
+ }
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src1].s64[j] >=
+ env->vfp.vreg[src2].s64[j]) {
+ env->vfp.vreg[dest].s64[j] =
+ env->vfp.vreg[src1].s64[j];
+ } else {
+ env->vfp.vreg[dest].s64[j] =
+ env->vfp.vreg[src2].s64[j];
+ }
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vmax_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((int8_t)env->gpr[rs1] >=
+ env->vfp.vreg[src2].s8[j]) {
+ env->vfp.vreg[dest].s8[j] =
+ env->gpr[rs1];
+ } else {
+ env->vfp.vreg[dest].s8[j] =
+ env->vfp.vreg[src2].s8[j];
+ }
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((int16_t)env->gpr[rs1] >=
+ env->vfp.vreg[src2].s16[j]) {
+ env->vfp.vreg[dest].s16[j] =
+ env->gpr[rs1];
+ } else {
+ env->vfp.vreg[dest].s16[j] =
+ env->vfp.vreg[src2].s16[j];
+ }
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((int32_t)env->gpr[rs1] >=
+ env->vfp.vreg[src2].s32[j]) {
+ env->vfp.vreg[dest].s32[j] =
+ env->gpr[rs1];
+ } else {
+ env->vfp.vreg[dest].s32[j] =
+ env->vfp.vreg[src2].s32[j];
+ }
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((int64_t)extend_gpr(env->gpr[rs1]) >=
+ env->vfp.vreg[src2].s64[j]) {
+ env->vfp.vreg[dest].s64[j] =
+ (int64_t)extend_gpr(env->gpr[rs1]);
+ } else {
+ env->vfp.vreg[dest].s64[j] =
+ env->vfp.vreg[src2].s64[j];
+ }
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
--
2.7.4
* [Qemu-devel] [PATCH v2 11/17] RISC-V: add vector extension integer instructions part4, mul/div/merge
2019-09-11 6:25 [Qemu-devel] [PATCH v2 00/17] RISC-V: support vector extension liuzhiwei
` (9 preceding siblings ...)
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 10/17] RISC-V: add vector extension integer instructions part3, cmp/min/max liuzhiwei
@ 2019-09-11 6:25 ` liuzhiwei
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 12/17] RISC-V: add vector extension fixed point instructions liuzhiwei
` (6 subsequent siblings)
17 siblings, 0 replies; 43+ messages in thread
From: liuzhiwei @ 2019-09-11 6:25 UTC (permalink / raw)
To: Alistair.Francis, palmer, sagark, kbastian, riku.voipio, laurent,
wenmeng_zhang
Cc: qemu-riscv, qemu-devel, wxy194768, LIU Zhiwei
From: LIU Zhiwei <zhiwei_liu@c-sky.com>
Signed-off-by: LIU Zhiwei <zhiwei_liu@c-sky.com>
---
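The 64-bit cases of the vmulhu helpers below call u64xu64_lh(), whose body
is not part of this hunk. As a rough illustration of what "high half of a
64x64 multiply" means, here is a minimal sketch built from 32-bit partial
products. The names mulhu64_sketch and mulhu64_sketch_check are hypothetical
and used only for this note; this is not claimed to be how u64xu64_lh() is
implemented in the series.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical sketch: upper 64 bits of the 128-bit product a * b,
 * computed from four 32x32->64 partial products.
 */
uint64_t mulhu64_sketch(uint64_t a, uint64_t b)
{
    uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
    uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;

    uint64_t lo_lo = a_lo * b_lo;
    uint64_t hi_lo = a_hi * b_lo;
    uint64_t lo_hi = a_lo * b_hi;
    uint64_t hi_hi = a_hi * b_hi;

    /* Sum everything that lands in bits 32..63; bits above 32 are carries. */
    uint64_t mid = (lo_lo >> 32) + (uint32_t)hi_lo + (uint32_t)lo_hi;

    return hi_hi + (hi_lo >> 32) + (lo_hi >> 32) + (mid >> 32);
}

/* Small usage check for the sketch above. */
void mulhu64_sketch_check(void)
{
    assert(mulhu64_sketch(0, UINT64_MAX) == 0);
    assert(mulhu64_sketch(UINT64_C(1) << 32, UINT64_C(1) << 32) == 1);
    assert(mulhu64_sketch(UINT64_MAX, UINT64_MAX) ==
           UINT64_C(0xFFFFFFFFFFFFFFFE));
}
```

For the 8-, 16- and 32-bit cases the helpers do not need this: they widen
both operands to the next larger type, multiply, and shift right by width.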
target/riscv/helper.h | 41 +
target/riscv/insn32.decode | 41 +
target/riscv/insn_trans/trans_rvv.inc.c | 41 +
target/riscv/vector_helper.c | 2838 +++++++++++++++++++++++++++++++
4 files changed, 2961 insertions(+)
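Every helper in vector_helper.c maps the flat element index i onto a
register number and an index inside that register with the same arithmetic
(rs + i / (VLEN / width) and i % (VLEN / width)). The sketch below isolates
that step, assuming VLEN is 128 as defined in patch 01. The names VLEN_BITS,
struct elem_pos and locate_elem are hypothetical and exist only for this
illustration.

```c
#include <assert.h>
#include <stdint.h>

#define VLEN_BITS 128   /* assumption: matches VLEN from patch 01 */

struct elem_pos {
    uint32_t reg;   /* vector register that holds element i */
    uint32_t idx;   /* index of element i inside that register */
};

/* Hypothetical helper: locate element i of a register group. */
struct elem_pos locate_elem(uint32_t base_reg, uint32_t i, uint32_t sew)
{
    assert(sew == 8 || sew == 16 || sew == 32 || sew == 64);

    uint32_t per_reg = VLEN_BITS / sew;   /* elements per register */
    struct elem_pos p = { base_reg + i / per_reg, i % per_reg };
    return p;
}
```

With 32-bit elements each register holds four of them, so element 5 of a
group starting at v8 is element 1 of v9.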
diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 7354b12..ab31ef7 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -229,5 +229,46 @@ DEF_HELPER_5(vector_vmsgtu_vi, void, env, i32, i32, i32, i32)
DEF_HELPER_5(vector_vmsgt_vx, void, env, i32, i32, i32, i32)
DEF_HELPER_5(vector_vmsgt_vi, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmul_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmul_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmulhsu_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmulhsu_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmulh_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmulh_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vdivu_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vdivu_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vdiv_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vdiv_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vremu_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vremu_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vrem_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vrem_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmulhu_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmulhu_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmadd_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmadd_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vnmsub_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vnmsub_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmacc_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmacc_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vnmsac_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vnmsac_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vwmulu_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vwmulu_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vwmulsu_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vwmulsu_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vwmul_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vwmul_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vwmaccu_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vwmaccu_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vwmacc_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vwmacc_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vwmaccsu_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vwmaccsu_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vwmaccus_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmerge_vvm, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmerge_vxm, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmerge_vim, void, env, i32, i32, i32, i32)
+
DEF_HELPER_4(vector_vsetvli, void, env, i32, i32, i32)
DEF_HELPER_4(vector_vsetvl, void, env, i32, i32, i32)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 1ff0b08..6db18c5 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -369,5 +369,46 @@ vmaxu_vx 000110 . ..... ..... 100 ..... 1010111 @r_vm
vmax_vv 000111 . ..... ..... 000 ..... 1010111 @r_vm
vmax_vx 000111 . ..... ..... 100 ..... 1010111 @r_vm
+vmul_vv 100101 . ..... ..... 010 ..... 1010111 @r_vm
+vmul_vx 100101 . ..... ..... 110 ..... 1010111 @r_vm
+vmulhsu_vv 100110 . ..... ..... 010 ..... 1010111 @r_vm
+vmulhsu_vx 100110 . ..... ..... 110 ..... 1010111 @r_vm
+vmulh_vv 100111 . ..... ..... 010 ..... 1010111 @r_vm
+vmulh_vx 100111 . ..... ..... 110 ..... 1010111 @r_vm
+vmulhu_vv 100100 . ..... ..... 010 ..... 1010111 @r_vm
+vmulhu_vx 100100 . ..... ..... 110 ..... 1010111 @r_vm
+vdivu_vv 100000 . ..... ..... 010 ..... 1010111 @r_vm
+vdivu_vx 100000 . ..... ..... 110 ..... 1010111 @r_vm
+vdiv_vv 100001 . ..... ..... 010 ..... 1010111 @r_vm
+vdiv_vx 100001 . ..... ..... 110 ..... 1010111 @r_vm
+vremu_vv 100010 . ..... ..... 010 ..... 1010111 @r_vm
+vremu_vx 100010 . ..... ..... 110 ..... 1010111 @r_vm
+vrem_vv 100011 . ..... ..... 010 ..... 1010111 @r_vm
+vrem_vx 100011 . ..... ..... 110 ..... 1010111 @r_vm
+vwmulu_vv 111000 . ..... ..... 010 ..... 1010111 @r_vm
+vwmulu_vx 111000 . ..... ..... 110 ..... 1010111 @r_vm
+vwmulsu_vv 111010 . ..... ..... 010 ..... 1010111 @r_vm
+vwmulsu_vx 111010 . ..... ..... 110 ..... 1010111 @r_vm
+vwmul_vv 111011 . ..... ..... 010 ..... 1010111 @r_vm
+vwmul_vx 111011 . ..... ..... 110 ..... 1010111 @r_vm
+vmacc_vv 101101 . ..... ..... 010 ..... 1010111 @r_vm
+vmacc_vx 101101 . ..... ..... 110 ..... 1010111 @r_vm
+vnmsac_vv 101111 . ..... ..... 010 ..... 1010111 @r_vm
+vnmsac_vx 101111 . ..... ..... 110 ..... 1010111 @r_vm
+vmadd_vv 101001 . ..... ..... 010 ..... 1010111 @r_vm
+vmadd_vx 101001 . ..... ..... 110 ..... 1010111 @r_vm
+vnmsub_vv 101011 . ..... ..... 010 ..... 1010111 @r_vm
+vnmsub_vx 101011 . ..... ..... 110 ..... 1010111 @r_vm
+vwmaccu_vv 111100 . ..... ..... 010 ..... 1010111 @r_vm
+vwmaccu_vx 111100 . ..... ..... 110 ..... 1010111 @r_vm
+vwmacc_vv 111101 . ..... ..... 010 ..... 1010111 @r_vm
+vwmacc_vx 111101 . ..... ..... 110 ..... 1010111 @r_vm
+vwmaccsu_vv 111110 . ..... ..... 010 ..... 1010111 @r_vm
+vwmaccsu_vx 111110 . ..... ..... 110 ..... 1010111 @r_vm
+vwmaccus_vx 111111 . ..... ..... 110 ..... 1010111 @r_vm
+vmerge_vvm 010111 . ..... ..... 000 ..... 1010111 @r_vm
+vmerge_vxm 010111 . ..... ..... 100 ..... 1010111 @r_vm
+vmerge_vim 010111 . ..... ..... 011 ..... 1010111 @r_vm
+
vsetvli 0 ........... ..... 111 ..... 1010111 @r2_zimm
vsetvl 1000000 ..... ..... 111 ..... 1010111 @r
diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index cd5ab07..1ba52e7 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -258,5 +258,46 @@ GEN_VECTOR_R_VM(vmaxu_vx)
GEN_VECTOR_R_VM(vmax_vv)
GEN_VECTOR_R_VM(vmax_vx)
+GEN_VECTOR_R_VM(vmulhu_vv)
+GEN_VECTOR_R_VM(vmulhu_vx)
+GEN_VECTOR_R_VM(vmul_vv)
+GEN_VECTOR_R_VM(vmul_vx)
+GEN_VECTOR_R_VM(vmulhsu_vv)
+GEN_VECTOR_R_VM(vmulhsu_vx)
+GEN_VECTOR_R_VM(vmulh_vv)
+GEN_VECTOR_R_VM(vmulh_vx)
+GEN_VECTOR_R_VM(vdivu_vv)
+GEN_VECTOR_R_VM(vdivu_vx)
+GEN_VECTOR_R_VM(vdiv_vv)
+GEN_VECTOR_R_VM(vdiv_vx)
+GEN_VECTOR_R_VM(vremu_vv)
+GEN_VECTOR_R_VM(vremu_vx)
+GEN_VECTOR_R_VM(vrem_vv)
+GEN_VECTOR_R_VM(vrem_vx)
+GEN_VECTOR_R_VM(vmacc_vv)
+GEN_VECTOR_R_VM(vmacc_vx)
+GEN_VECTOR_R_VM(vnmsac_vv)
+GEN_VECTOR_R_VM(vnmsac_vx)
+GEN_VECTOR_R_VM(vmadd_vv)
+GEN_VECTOR_R_VM(vmadd_vx)
+GEN_VECTOR_R_VM(vnmsub_vv)
+GEN_VECTOR_R_VM(vnmsub_vx)
+GEN_VECTOR_R_VM(vwmulu_vv)
+GEN_VECTOR_R_VM(vwmulu_vx)
+GEN_VECTOR_R_VM(vwmulsu_vv)
+GEN_VECTOR_R_VM(vwmulsu_vx)
+GEN_VECTOR_R_VM(vwmul_vv)
+GEN_VECTOR_R_VM(vwmul_vx)
+GEN_VECTOR_R_VM(vwmaccu_vv)
+GEN_VECTOR_R_VM(vwmaccu_vx)
+GEN_VECTOR_R_VM(vwmacc_vv)
+GEN_VECTOR_R_VM(vwmacc_vx)
+GEN_VECTOR_R_VM(vwmaccsu_vv)
+GEN_VECTOR_R_VM(vwmaccsu_vx)
+GEN_VECTOR_R_VM(vwmaccus_vx)
+GEN_VECTOR_R_VM(vmerge_vvm)
+GEN_VECTOR_R_VM(vmerge_vxm)
+GEN_VECTOR_R_VM(vmerge_vim)
+
GEN_VECTOR_R2_ZIMM(vsetvli)
GEN_VECTOR_R(vsetvl)
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index fbf2145..49f1cb8 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -10888,3 +10888,2841 @@ void VECTOR_HELPER(vmax_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
env->vfp.vstart = 0;
}
+void VECTOR_HELPER(vmulhu_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src1, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u8[j] =
+ ((uint16_t)env->vfp.vreg[src1].u8[j]
+ * (uint16_t)env->vfp.vreg[src2].u8[j]) >> width;
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[j] =
+ ((uint32_t)env->vfp.vreg[src1].u16[j]
+ * (uint32_t)env->vfp.vreg[src2].u16[j]) >> width;
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[j] =
+ ((uint64_t)env->vfp.vreg[src1].u32[j]
+ * (uint64_t)env->vfp.vreg[src2].u32[j]) >> width;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[j] = u64xu64_lh(
+ env->vfp.vreg[src1].u64[j], env->vfp.vreg[src2].u64[j]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vmulhu_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u8[j] =
+ ((uint16_t)(uint8_t)env->gpr[rs1]
+ * (uint16_t)env->vfp.vreg[src2].u8[j]) >> width;
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[j] =
+ ((uint32_t)(uint16_t)env->gpr[rs1]
+ * (uint32_t)env->vfp.vreg[src2].u16[j]) >> width;
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[j] =
+ ((uint64_t)(uint32_t)env->gpr[rs1]
+ * (uint64_t)env->vfp.vreg[src2].u32[j]) >> width;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[j] = u64xu64_lh(
+ (uint64_t)extend_gpr(env->gpr[rs1])
+ , env->vfp.vreg[src2].u64[j]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vmul_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src1, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[j] = env->vfp.vreg[src1].s8[j]
+ * env->vfp.vreg[src2].s8[j];
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[j] = env->vfp.vreg[src1].s16[j]
+ * env->vfp.vreg[src2].s16[j];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[j] = env->vfp.vreg[src1].s32[j]
+ * env->vfp.vreg[src2].s32[j];
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[j] = env->vfp.vreg[src1].s64[j]
+ * env->vfp.vreg[src2].s64[j];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vmul_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[j] = env->gpr[rs1]
+ * env->vfp.vreg[src2].s8[j];
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[j] = env->gpr[rs1]
+ * env->vfp.vreg[src2].s16[j];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[j] = env->gpr[rs1]
+ * env->vfp.vreg[src2].s32[j];
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[j] =
+ (int64_t)extend_gpr(env->gpr[rs1])
+ * env->vfp.vreg[src2].s64[j];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vmulhsu_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src1, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[j] =
+ ((uint16_t)env->vfp.vreg[src1].u8[j]
+ * (int16_t)env->vfp.vreg[src2].s8[j]) >> width;
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[j] =
+ ((uint32_t)env->vfp.vreg[src1].u16[j]
+ * (int32_t)env->vfp.vreg[src2].s16[j]) >> width;
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[j] =
+ ((uint64_t)env->vfp.vreg[src1].u32[j]
+ * (int64_t)env->vfp.vreg[src2].s32[j]) >> width;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[j] = s64xu64_lh(
+ env->vfp.vreg[src2].s64[j], env->vfp.vreg[src1].u64[j]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vmulhsu_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[j] =
+ ((uint16_t)(uint8_t)env->gpr[rs1]
+ * (int16_t)env->vfp.vreg[src2].s8[j]) >> width;
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[j] =
+ ((uint32_t)(uint16_t)env->gpr[rs1]
+ * (int32_t)env->vfp.vreg[src2].s16[j]) >> width;
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[j] =
+ ((uint64_t)(uint32_t)env->gpr[rs1]
+ * (int64_t)env->vfp.vreg[src2].s32[j]) >> width;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[j] = s64xu64_lh(
+ env->vfp.vreg[src2].s64[j],
+ (uint64_t)extend_gpr(env->gpr[rs1]));
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+/* vmulh.vv: high half of signed vs2[i] * signed vs1[i] */
+void VECTOR_HELPER(vmulh_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src1, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[j] =
+ ((int16_t)env->vfp.vreg[src1].s8[j]
+ * (int16_t)env->vfp.vreg[src2].s8[j]) >> width;
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[j] =
+ ((int32_t)env->vfp.vreg[src1].s16[j]
+ * (int32_t)env->vfp.vreg[src2].s16[j]) >> width;
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[j] =
+ ((int64_t)env->vfp.vreg[src1].s32[j]
+ * (int64_t)env->vfp.vreg[src2].s32[j]) >> width;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[j] = s64xs64_lh(
+ env->vfp.vreg[src1].s64[j], env->vfp.vreg[src2].s64[j]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+/* vmulh.vx: high half of signed vs2[i] * signed x[rs1] */
+void VECTOR_HELPER(vmulh_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[j] =
+ ((int16_t)(int8_t)env->gpr[rs1]
+ * (int16_t)env->vfp.vreg[src2].s8[j]) >> width;
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[j] =
+ ((int32_t)(int16_t)env->gpr[rs1]
+ * (int32_t)env->vfp.vreg[src2].s16[j]) >> width;
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[j] =
+ ((int64_t)(int32_t)env->gpr[rs1]
+ * (int64_t)env->vfp.vreg[src2].s32[j]) >> width;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[j] = s64xs64_lh(
+ (int64_t)extend_gpr(env->gpr[rs1]),
+ env->vfp.vreg[src2].s64[j]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+/* vdivu.vv: unsigned vs2[i] / vs1[i]; division by zero yields all ones */
+void VECTOR_HELPER(vdivu_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src1, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src1].u8[j] == 0) {
+ env->vfp.vreg[dest].u8[j] = UINT8_MAX;
+ } else {
+ env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src2].u8[j] /
+ env->vfp.vreg[src1].u8[j];
+ }
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src1].u16[j] == 0) {
+ env->vfp.vreg[dest].u16[j] = UINT16_MAX;
+ } else {
+ env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src2].u16[j]
+ / env->vfp.vreg[src1].u16[j];
+ }
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src1].u32[j] == 0) {
+ env->vfp.vreg[dest].u32[j] = UINT32_MAX;
+ } else {
+ env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src2].u32[j]
+ / env->vfp.vreg[src1].u32[j];
+ }
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src1].u64[j] == 0) {
+ env->vfp.vreg[dest].u64[j] = UINT64_MAX;
+ } else {
+ env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src2].u64[j]
+ / env->vfp.vreg[src1].u64[j];
+ }
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+/* vdivu.vx: unsigned vs2[i] / x[rs1]; division by zero yields all ones */
+void VECTOR_HELPER(vdivu_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((uint8_t)env->gpr[rs1] == 0) {
+ env->vfp.vreg[dest].u8[j] = UINT8_MAX;
+ } else {
+ env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src2].u8[j] /
+ (uint8_t)env->gpr[rs1];
+ }
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((uint16_t)env->gpr[rs1] == 0) {
+ env->vfp.vreg[dest].u16[j] = UINT16_MAX;
+ } else {
+ env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src2].u16[j]
+ / (uint16_t)env->gpr[rs1];
+ }
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((uint32_t)env->gpr[rs1] == 0) {
+ env->vfp.vreg[dest].u32[j] = UINT32_MAX;
+ } else {
+ env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src2].u32[j]
+ / (uint32_t)env->gpr[rs1];
+ }
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((uint64_t)extend_gpr(env->gpr[rs1]) == 0) {
+ env->vfp.vreg[dest].u64[j] = UINT64_MAX;
+ } else {
+ env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src2].u64[j]
+ / (uint64_t)extend_gpr(env->gpr[rs1]);
+ }
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
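+/*
+ * vdiv.vv: signed vs2[i] / vs1[i]. Division by zero yields -1;
+ * the overflow case (minimum value / -1) yields the minimum value.
+ */
+void VECTOR_HELPER(vdiv_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,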
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src1, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src1].s8[j] == 0) {
+ env->vfp.vreg[dest].s8[j] = -1;
+ } else if ((env->vfp.vreg[src2].s8[j] == INT8_MIN) &&
+ (env->vfp.vreg[src1].s8[j] == (int8_t)(-1))) {
+ env->vfp.vreg[dest].s8[j] = INT8_MIN;
+ } else {
+ env->vfp.vreg[dest].s8[j] = env->vfp.vreg[src2].s8[j] /
+ env->vfp.vreg[src1].s8[j];
+ }
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src1].s16[j] == 0) {
+ env->vfp.vreg[dest].s16[j] = -1;
+ } else if ((env->vfp.vreg[src2].s16[j] == INT16_MIN) &&
+ (env->vfp.vreg[src1].s16[j] == (int16_t)(-1))) {
+ env->vfp.vreg[dest].s16[j] = INT16_MIN;
+ } else {
+ env->vfp.vreg[dest].s16[j] = env->vfp.vreg[src2].s16[j]
+ / env->vfp.vreg[src1].s16[j];
+ }
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src1].s32[j] == 0) {
+ env->vfp.vreg[dest].s32[j] = -1;
+ } else if ((env->vfp.vreg[src2].s32[j] == INT32_MIN) &&
+ (env->vfp.vreg[src1].s32[j] == (int32_t)(-1))) {
+ env->vfp.vreg[dest].s32[j] = INT32_MIN;
+ } else {
+ env->vfp.vreg[dest].s32[j] = env->vfp.vreg[src2].s32[j]
+ / env->vfp.vreg[src1].s32[j];
+ }
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src1].s64[j] == 0) {
+ env->vfp.vreg[dest].s64[j] = -1;
+ } else if ((env->vfp.vreg[src2].s64[j] == INT64_MIN) &&
+ (env->vfp.vreg[src1].s64[j] == (int64_t)(-1))) {
+ env->vfp.vreg[dest].s64[j] = INT64_MIN;
+ } else {
+ env->vfp.vreg[dest].s64[j] = env->vfp.vreg[src2].s64[j]
+ / env->vfp.vreg[src1].s64[j];
+ }
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
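+
+/*
+ * vdiv.vx: signed vs2[i] / x[rs1]. Division by zero yields -1;
+ * the overflow case (minimum value / -1) yields the minimum value.
+ */
+void VECTOR_HELPER(vdiv_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,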
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((int8_t)env->gpr[rs1] == 0) {
+ env->vfp.vreg[dest].s8[j] = -1;
+ } else if ((env->vfp.vreg[src2].s8[j] == INT8_MIN) &&
+ ((int8_t)env->gpr[rs1] == (int8_t)(-1))) {
+ env->vfp.vreg[dest].s8[j] = INT8_MIN;
+ } else {
+ env->vfp.vreg[dest].s8[j] = env->vfp.vreg[src2].s8[j] /
+ (int8_t)env->gpr[rs1];
+ }
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((int16_t)env->gpr[rs1] == 0) {
+ env->vfp.vreg[dest].s16[j] = -1;
+ } else if ((env->vfp.vreg[src2].s16[j] == INT16_MIN) &&
+ ((int16_t)env->gpr[rs1] == (int16_t)(-1))) {
+ env->vfp.vreg[dest].s16[j] = INT16_MIN;
+ } else {
+ env->vfp.vreg[dest].s16[j] = env->vfp.vreg[src2].s16[j]
+ / (int16_t)env->gpr[rs1];
+ }
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((int32_t)env->gpr[rs1] == 0) {
+ env->vfp.vreg[dest].s32[j] = -1;
+ } else if ((env->vfp.vreg[src2].s32[j] == INT32_MIN) &&
+ ((int32_t)env->gpr[rs1] == (int32_t)(-1))) {
+ env->vfp.vreg[dest].s32[j] = INT32_MIN;
+ } else {
+ env->vfp.vreg[dest].s32[j] = env->vfp.vreg[src2].s32[j]
+ / (int32_t)env->gpr[rs1];
+ }
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((int64_t)extend_gpr(env->gpr[rs1]) == 0) {
+ env->vfp.vreg[dest].s64[j] = -1;
+ } else if ((env->vfp.vreg[src2].s64[j] == INT64_MIN) &&
+ ((int64_t)extend_gpr(env->gpr[rs1]) == (int64_t)(-1))) {
+ env->vfp.vreg[dest].s64[j] = INT64_MIN;
+ } else {
+ env->vfp.vreg[dest].s64[j] = env->vfp.vreg[src2].s64[j]
+ / (int64_t)extend_gpr(env->gpr[rs1]);
+ }
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+/* vremu.vv: unsigned vs2[i] % vs1[i]; a zero divisor yields the dividend */
+void VECTOR_HELPER(vremu_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src1, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src1].u8[j] == 0) {
+ env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src2].u8[j];
+ } else {
+ env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src2].u8[j] %
+ env->vfp.vreg[src1].u8[j];
+ }
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src1].u16[j] == 0) {
+ env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src2].u16[j];
+ } else {
+ env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src2].u16[j]
+ % env->vfp.vreg[src1].u16[j];
+ }
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src1].u32[j] == 0) {
+ env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src2].u32[j];
+ } else {
+ env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src2].u32[j]
+ % env->vfp.vreg[src1].u32[j];
+ }
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src1].u64[j] == 0) {
+ env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src2].u64[j];
+ } else {
+ env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src2].u64[j]
+ % env->vfp.vreg[src1].u64[j];
+ }
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+
+ env->vfp.vstart = 0;
+}
+
+/* vremu.vx: unsigned vs2[i] % x[rs1]; a zero divisor yields the dividend */
+void VECTOR_HELPER(vremu_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((uint8_t)env->gpr[rs1] == 0) {
+ env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src2].u8[j];
+ } else {
+ env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src2].u8[j] %
+ (uint8_t)env->gpr[rs1];
+ }
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((uint16_t)env->gpr[rs1] == 0) {
+ env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src2].u16[j];
+ } else {
+ env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src2].u16[j]
+ % (uint16_t)env->gpr[rs1];
+ }
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((uint32_t)env->gpr[rs1] == 0) {
+ env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src2].u32[j];
+ } else {
+ env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src2].u32[j]
+ % (uint32_t)env->gpr[rs1];
+ }
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((uint64_t)extend_gpr(env->gpr[rs1]) == 0) {
+ env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src2].u64[j];
+ } else {
+ env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src2].u64[j]
+ % (uint64_t)extend_gpr(env->gpr[rs1]);
+ }
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
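+/*
+ * vrem.vv: signed vs2[i] % vs1[i]. A zero divisor yields the dividend;
+ * the overflow case (minimum value % -1) yields 0.
+ */
+void VECTOR_HELPER(vrem_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,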
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src1, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src1].s8[j] == 0) {
+ env->vfp.vreg[dest].s8[j] = env->vfp.vreg[src2].s8[j];
+ } else if ((env->vfp.vreg[src2].s8[j] == INT8_MIN) &&
+ (env->vfp.vreg[src1].s8[j] == (int8_t)(-1))) {
+ env->vfp.vreg[dest].s8[j] = 0;
+ } else {
+ env->vfp.vreg[dest].s8[j] = env->vfp.vreg[src2].s8[j] %
+ env->vfp.vreg[src1].s8[j];
+ }
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src1].s16[j] == 0) {
+ env->vfp.vreg[dest].s16[j] = env->vfp.vreg[src2].s16[j];
+ } else if ((env->vfp.vreg[src2].s16[j] == INT16_MIN) &&
+ (env->vfp.vreg[src1].s16[j] == (int16_t)(-1))) {
+ env->vfp.vreg[dest].s16[j] = 0;
+ } else {
+ env->vfp.vreg[dest].s16[j] = env->vfp.vreg[src2].s16[j]
+ % env->vfp.vreg[src1].s16[j];
+ }
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src1].s32[j] == 0) {
+ env->vfp.vreg[dest].s32[j] = env->vfp.vreg[src2].s32[j];
+ } else if ((env->vfp.vreg[src2].s32[j] == INT32_MIN) &&
+ (env->vfp.vreg[src1].s32[j] == (int32_t)(-1))) {
+ env->vfp.vreg[dest].s32[j] = 0;
+ } else {
+ env->vfp.vreg[dest].s32[j] = env->vfp.vreg[src2].s32[j]
+ % env->vfp.vreg[src1].s32[j];
+ }
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (env->vfp.vreg[src1].s64[j] == 0) {
+ env->vfp.vreg[dest].s64[j] = env->vfp.vreg[src2].s64[j];
+ } else if ((env->vfp.vreg[src2].s64[j] == INT64_MIN) &&
+ (env->vfp.vreg[src1].s64[j] == (int64_t)(-1))) {
+ env->vfp.vreg[dest].s64[j] = 0;
+ } else {
+ env->vfp.vreg[dest].s64[j] = env->vfp.vreg[src2].s64[j]
+ % env->vfp.vreg[src1].s64[j];
+ }
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
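+
+/*
+ * vrem.vx: signed vs2[i] % x[rs1]. A zero divisor yields the dividend;
+ * the overflow case (minimum value % -1) yields 0.
+ */
+void VECTOR_HELPER(vrem_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,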
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((int8_t)env->gpr[rs1] == 0) {
+ env->vfp.vreg[dest].s8[j] = env->vfp.vreg[src2].s8[j];
+ } else if ((env->vfp.vreg[src2].s8[j] == INT8_MIN) &&
+ ((int8_t)env->gpr[rs1] == (int8_t)(-1))) {
+ env->vfp.vreg[dest].s8[j] = 0;
+ } else {
+ env->vfp.vreg[dest].s8[j] = env->vfp.vreg[src2].s8[j] %
+ (int8_t)env->gpr[rs1];
+ }
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((int16_t)env->gpr[rs1] == 0) {
+ env->vfp.vreg[dest].s16[j] = env->vfp.vreg[src2].s16[j];
+ } else if ((env->vfp.vreg[src2].s16[j] == INT16_MIN) &&
+ ((int16_t)env->gpr[rs1] == (int16_t)(-1))) {
+ env->vfp.vreg[dest].s16[j] = 0;
+ } else {
+ env->vfp.vreg[dest].s16[j] = env->vfp.vreg[src2].s16[j]
+ % (int16_t)env->gpr[rs1];
+ }
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((int32_t)env->gpr[rs1] == 0) {
+ env->vfp.vreg[dest].s32[j] = env->vfp.vreg[src2].s32[j];
+ } else if ((env->vfp.vreg[src2].s32[j] == INT32_MIN) &&
+ ((int32_t)env->gpr[rs1] == (int32_t)(-1))) {
+ env->vfp.vreg[dest].s32[j] = 0;
+ } else {
+ env->vfp.vreg[dest].s32[j] = env->vfp.vreg[src2].s32[j]
+ % (int32_t)env->gpr[rs1];
+ }
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if ((int64_t)extend_gpr(env->gpr[rs1]) == 0) {
+ env->vfp.vreg[dest].s64[j] = env->vfp.vreg[src2].s64[j];
+ } else if ((env->vfp.vreg[src2].s64[j] == INT64_MIN) &&
+ ((int64_t)extend_gpr(env->gpr[rs1]) == (int64_t)(-1))) {
+ env->vfp.vreg[dest].s64[j] = 0;
+ } else {
+ env->vfp.vreg[dest].s64[j] = env->vfp.vreg[src2].s64[j]
+ % (int64_t)extend_gpr(env->gpr[rs1]);
+ }
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+
+ env->vfp.vstart = 0;
+}
+
+/* vmacc.vv: vd[i] += vs1[i] * vs2[i] (low SEW bits kept) */
+void VECTOR_HELPER(vmacc_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src1, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[j] += env->vfp.vreg[src1].s8[j]
+ * env->vfp.vreg[src2].s8[j];
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[j] += env->vfp.vreg[src1].s16[j]
+ * env->vfp.vreg[src2].s16[j];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[j] += env->vfp.vreg[src1].s32[j]
+ * env->vfp.vreg[src2].s32[j];
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[j] += env->vfp.vreg[src1].s64[j]
+ * env->vfp.vreg[src2].s64[j];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+/* vmacc.vx: vd[i] += x[rs1] * vs2[i] (low SEW bits kept) */
+void VECTOR_HELPER(vmacc_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[j] += env->gpr[rs1]
+ * env->vfp.vreg[src2].s8[j];
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[j] += env->gpr[rs1]
+ * env->vfp.vreg[src2].s16[j];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[j] += env->gpr[rs1]
+ * env->vfp.vreg[src2].s32[j];
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[j] +=
+ (int64_t)extend_gpr(env->gpr[rs1])
+ * env->vfp.vreg[src2].s64[j];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+/* vnmsac.vv: vd[i] -= vs1[i] * vs2[i] (low SEW bits kept) */
+void VECTOR_HELPER(vnmsac_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src1, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[j] -= env->vfp.vreg[src1].s8[j]
+ * env->vfp.vreg[src2].s8[j];
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[j] -= env->vfp.vreg[src1].s16[j]
+ * env->vfp.vreg[src2].s16[j];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[j] -= env->vfp.vreg[src1].s32[j]
+ * env->vfp.vreg[src2].s32[j];
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[j] -= env->vfp.vreg[src1].s64[j]
+ * env->vfp.vreg[src2].s64[j];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+/* vnmsac.vx: vd[i] -= x[rs1] * vs2[i] (low SEW bits kept) */
+void VECTOR_HELPER(vnmsac_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[j] -= env->gpr[rs1]
+ * env->vfp.vreg[src2].s8[j];
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[j] -= env->gpr[rs1]
+ * env->vfp.vreg[src2].s16[j];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[j] -= env->gpr[rs1]
+ * env->vfp.vreg[src2].s32[j];
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[j] -=
+ (int64_t)extend_gpr(env->gpr[rs1])
+ * env->vfp.vreg[src2].s64[j];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+/* vmadd.vv: vd[i] = vs1[i] * vd[i] + vs2[i] (low SEW bits kept) */
+void VECTOR_HELPER(vmadd_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src1, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[j] = env->vfp.vreg[src1].s8[j]
+ * env->vfp.vreg[dest].s8[j]
+ + env->vfp.vreg[src2].s8[j];
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[j] = env->vfp.vreg[src1].s16[j]
+ * env->vfp.vreg[dest].s16[j]
+ + env->vfp.vreg[src2].s16[j];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[j] = env->vfp.vreg[src1].s32[j]
+ * env->vfp.vreg[dest].s32[j]
+ + env->vfp.vreg[src2].s32[j];
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[j] = env->vfp.vreg[src1].s64[j]
+ * env->vfp.vreg[dest].s64[j]
+ + env->vfp.vreg[src2].s64[j];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+/* vmadd.vx: vd[i] = x[rs1] * vd[i] + vs2[i] (low SEW bits kept) */
+void VECTOR_HELPER(vmadd_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[j] = env->gpr[rs1]
+ * env->vfp.vreg[dest].s8[j]
+ + env->vfp.vreg[src2].s8[j];
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[j] = env->gpr[rs1]
+ * env->vfp.vreg[dest].s16[j]
+ + env->vfp.vreg[src2].s16[j];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[j] = env->gpr[rs1]
+ * env->vfp.vreg[dest].s32[j]
+ + env->vfp.vreg[src2].s32[j];
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[j] =
+ (int64_t)extend_gpr(env->gpr[rs1])
+ * env->vfp.vreg[dest].s64[j]
+ + env->vfp.vreg[src2].s64[j];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vnmsub_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src1, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[j] = env->vfp.vreg[src2].s8[j]
+ - env->vfp.vreg[src1].s8[j]
+ * env->vfp.vreg[dest].s8[j];
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[j] = env->vfp.vreg[src2].s16[j]
+ - env->vfp.vreg[src1].s16[j]
+ * env->vfp.vreg[dest].s16[j];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[j] = env->vfp.vreg[src2].s32[j]
+ - env->vfp.vreg[src1].s32[j]
+ * env->vfp.vreg[dest].s32[j];
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[j] = env->vfp.vreg[src2].s64[j]
+ - env->vfp.vreg[src1].s64[j]
+ * env->vfp.vreg[dest].s64[j];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vnmsub_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[j] = env->vfp.vreg[src2].s8[j]
+ - env->gpr[rs1]
+ * env->vfp.vreg[dest].s8[j];
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[j] = env->vfp.vreg[src2].s16[j]
+ - env->gpr[rs1]
+ * env->vfp.vreg[dest].s16[j];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[j] = env->vfp.vreg[src2].s32[j]
+ - env->gpr[rs1]
+ * env->vfp.vreg[dest].s32[j];
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[j] = env->vfp.vreg[src2].s64[j]
+ - (int64_t)extend_gpr(env->gpr[rs1])
+ * env->vfp.vreg[dest].s64[j];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vwmulu_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl;
+ uint32_t lmul, width, src1, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / (2 * width)));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[k] =
+ (uint16_t)env->vfp.vreg[src1].u8[j] *
+ (uint16_t)env->vfp.vreg[src2].u8[j];
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[k] =
+ (uint32_t)env->vfp.vreg[src1].u16[j] *
+ (uint32_t)env->vfp.vreg[src2].u16[j];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[k] =
+ (uint64_t)env->vfp.vreg[src1].u32[j] *
+ (uint64_t)env->vfp.vreg[src2].u32[j];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_widen(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vwmulu_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / (2 * width)));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[k] =
+ (uint16_t)env->vfp.vreg[src2].u8[j] *
+ (uint16_t)((uint8_t)env->gpr[rs1]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[k] =
+ (uint32_t)env->vfp.vreg[src2].u16[j] *
+ (uint32_t)((uint16_t)env->gpr[rs1]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[k] =
+ (uint64_t)env->vfp.vreg[src2].u32[j] *
+ (uint64_t)((uint32_t)env->gpr[rs1]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_widen(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vwmulsu_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl;
+ uint32_t lmul, width, src1, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / (2 * width)));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[k] =
+ (int16_t)env->vfp.vreg[src2].s8[j] *
+ (uint16_t)env->vfp.vreg[src1].u8[j];
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[k] =
+ (int32_t)env->vfp.vreg[src2].s16[j] *
+ (uint32_t)env->vfp.vreg[src1].u16[j];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[k] =
+ (int64_t)env->vfp.vreg[src2].s32[j] *
+ (uint64_t)env->vfp.vreg[src1].u32[j];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_widen(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vwmulsu_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / (2 * width)));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[k] =
+ (int16_t)((int8_t)env->vfp.vreg[src2].s8[j]) *
+ (uint16_t)((uint8_t)env->gpr[rs1]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[k] =
+ (int32_t)((int16_t)env->vfp.vreg[src2].s16[j]) *
+ (uint32_t)((uint16_t)env->gpr[rs1]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[k] =
+ (int64_t)((int32_t)env->vfp.vreg[src2].s32[j]) *
+ (uint64_t)((uint32_t)env->gpr[rs1]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_widen(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vwmul_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl;
+ uint32_t lmul, width, src1, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / (2 * width)));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[k] =
+ (int16_t)env->vfp.vreg[src1].s8[j] *
+ (int16_t)env->vfp.vreg[src2].s8[j];
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[k] =
+ (int32_t)env->vfp.vreg[src1].s16[j] *
+ (int32_t)env->vfp.vreg[src2].s16[j];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[k] =
+ (int64_t)env->vfp.vreg[src1].s32[j] *
+ (int64_t)env->vfp.vreg[src2].s32[j];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_widen(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vwmul_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / (2 * width)));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[k] =
+ (int16_t)((int8_t)env->vfp.vreg[src2].s8[j]) *
+ (int16_t)((int8_t)env->gpr[rs1]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[k] =
+ (int32_t)((int16_t)env->vfp.vreg[src2].s16[j]) *
+ (int32_t)((int16_t)env->gpr[rs1]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[k] =
+ (int64_t)((int32_t)env->vfp.vreg[src2].s32[j]) *
+ (int64_t)((int32_t)env->gpr[rs1]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_widen(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vwmaccu_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl;
+ uint32_t lmul, width, src1, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / (2 * width)));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[k] +=
+ (uint16_t)env->vfp.vreg[src1].u8[j] *
+ (uint16_t)env->vfp.vreg[src2].u8[j];
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[k] +=
+ (uint32_t)env->vfp.vreg[src1].u16[j] *
+ (uint32_t)env->vfp.vreg[src2].u16[j];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[k] +=
+ (uint64_t)env->vfp.vreg[src1].u32[j] *
+ (uint64_t)env->vfp.vreg[src2].u32[j];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_widen(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vwmaccu_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / (2 * width)));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[k] +=
+ (uint16_t)env->vfp.vreg[src2].u8[j] *
+ (uint16_t)((uint8_t)env->gpr[rs1]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[k] +=
+ (uint32_t)env->vfp.vreg[src2].u16[j] *
+ (uint32_t)((uint16_t)env->gpr[rs1]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[k] +=
+ (uint64_t)env->vfp.vreg[src2].u32[j] *
+ (uint64_t)((uint32_t)env->gpr[rs1]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_widen(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vwmaccsu_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl;
+ uint32_t lmul, width, src1, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / (2 * width)));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[k] +=
+ (int16_t)env->vfp.vreg[src1].s8[j]
+ * (uint16_t)env->vfp.vreg[src2].u8[j];
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[k] +=
+ (int32_t)env->vfp.vreg[src1].s16[j] *
+ (uint32_t)env->vfp.vreg[src2].u16[j];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[k] +=
+ (int64_t)env->vfp.vreg[src1].s32[j] *
+ (uint64_t)env->vfp.vreg[src2].u32[j];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_widen(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vwmaccsu_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / (2 * width)));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[k] +=
+ (uint16_t)((uint8_t)env->vfp.vreg[src2].u8[j]) *
+ (int16_t)((int8_t)env->gpr[rs1]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[k] +=
+ (uint32_t)((uint16_t)env->vfp.vreg[src2].u16[j]) *
+ (int32_t)((int16_t)env->gpr[rs1]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[k] +=
+ (uint64_t)((uint32_t)env->vfp.vreg[src2].u32[j]) *
+ (int64_t)((int32_t)env->gpr[rs1]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_widen(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vwmaccus_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / (2 * width)));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[k] +=
+ (int16_t)((int8_t)env->vfp.vreg[src2].s8[j]) *
+ (uint16_t)((uint8_t)env->gpr[rs1]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[k] +=
+ (int32_t)((int16_t)env->vfp.vreg[src2].s16[j]) *
+ (uint32_t)((uint16_t)env->gpr[rs1]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[k] +=
+ (int64_t)((int32_t)env->vfp.vreg[src2].s32[j]) *
+ (uint64_t)((uint32_t)env->gpr[rs1]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_widen(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vwmacc_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl;
+ uint32_t lmul, width, src1, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / (2 * width)));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[k] +=
+ (int16_t)env->vfp.vreg[src1].s8[j]
+ * (int16_t)env->vfp.vreg[src2].s8[j];
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[k] +=
+ (int32_t)env->vfp.vreg[src1].s16[j] *
+ (int32_t)env->vfp.vreg[src2].s16[j];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[k] +=
+ (int64_t)env->vfp.vreg[src1].s32[j] *
+ (int64_t)env->vfp.vreg[src2].s32[j];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_widen(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vwmacc_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, k, vl;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / (2 * width)));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[k] +=
+ (int16_t)((int8_t)env->vfp.vreg[src2].s8[j]) *
+ (int16_t)((int8_t)env->gpr[rs1]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[k] +=
+ (int32_t)((int16_t)env->vfp.vreg[src2].s16[j]) *
+ (int32_t)((int16_t)env->gpr[rs1]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[k] +=
+ (int64_t)((int32_t)env->vfp.vreg[src2].s32[j]) *
+ (int64_t)((int32_t)env->gpr[rs1]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_widen(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
+void VECTOR_HELPER(vmerge_vvm)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl, idx, pos;
+ uint32_t lmul, width, src1, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src1 = rs1 + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vm == 0) {
+ vector_get_layout(env, width, lmul, i, &idx, &pos);
+ if (((env->vfp.vreg[0].u8[idx] >> pos) & 0x1) == 0) {
+ env->vfp.vreg[dest].u8[j] =
+ env->vfp.vreg[src2].u8[j];
+ } else {
+ env->vfp.vreg[dest].u8[j] =
+ env->vfp.vreg[src1].u8[j];
+ }
+ } else {
+ if (rs2 != 0) {
+ riscv_raise_exception(env,
+ RISCV_EXCP_ILLEGAL_INST, GETPC());
+ }
+ env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src1].u8[j];
+ }
+ break;
+ case 16:
+ if (vm == 0) {
+ vector_get_layout(env, width, lmul, i, &idx, &pos);
+ if (((env->vfp.vreg[0].u8[idx] >> pos) & 0x1) == 0) {
+ env->vfp.vreg[dest].u16[j] =
+ env->vfp.vreg[src2].u16[j];
+ } else {
+ env->vfp.vreg[dest].u16[j] =
+ env->vfp.vreg[src1].u16[j];
+ }
+ } else {
+ if (rs2 != 0) {
+ riscv_raise_exception(env,
+ RISCV_EXCP_ILLEGAL_INST, GETPC());
+ }
+ env->vfp.vreg[dest].u16[j] = env->vfp.vreg[src1].u16[j];
+ }
+ break;
+ case 32:
+ if (vm == 0) {
+ vector_get_layout(env, width, lmul, i, &idx, &pos);
+ if (((env->vfp.vreg[0].u8[idx] >> pos) & 0x1) == 0) {
+ env->vfp.vreg[dest].u32[j] =
+ env->vfp.vreg[src2].u32[j];
+ } else {
+ env->vfp.vreg[dest].u32[j] =
+ env->vfp.vreg[src1].u32[j];
+ }
+ } else {
+ if (rs2 != 0) {
+ riscv_raise_exception(env,
+ RISCV_EXCP_ILLEGAL_INST, GETPC());
+ }
+ env->vfp.vreg[dest].u32[j] = env->vfp.vreg[src1].u32[j];
+ }
+ break;
+ case 64:
+ if (vm == 0) {
+ vector_get_layout(env, width, lmul, i, &idx, &pos);
+ if (((env->vfp.vreg[0].u8[idx] >> pos) & 0x1) == 0) {
+ env->vfp.vreg[dest].u64[j] =
+ env->vfp.vreg[src2].u64[j];
+ } else {
+ env->vfp.vreg[dest].u64[j] =
+ env->vfp.vreg[src1].u64[j];
+ }
+ } else {
+ if (rs2 != 0) {
+ riscv_raise_exception(env,
+ RISCV_EXCP_ILLEGAL_INST, GETPC());
+ }
+ env->vfp.vreg[dest].u64[j] = env->vfp.vreg[src1].u64[j];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vmerge_vxm)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl, idx, pos;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
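+
+ if (vector_vtype_ill(env)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+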
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vm == 0) {
+ vector_get_layout(env, width, lmul, i, &idx, &pos);
+ if (((env->vfp.vreg[0].u8[idx] >> pos) & 0x1) == 0) {
+ env->vfp.vreg[dest].u8[j] =
+ env->vfp.vreg[src2].u8[j];
+ } else {
+ env->vfp.vreg[dest].u8[j] = env->gpr[rs1];
+ }
+ } else {
+ if (rs2 != 0) {
+ riscv_raise_exception(env,
+ RISCV_EXCP_ILLEGAL_INST, GETPC());
+ }
+ env->vfp.vreg[dest].u8[j] = env->gpr[rs1];
+ }
+ break;
+ case 16:
+ if (vm == 0) {
+ vector_get_layout(env, width, lmul, i, &idx, &pos);
+ if (((env->vfp.vreg[0].u8[idx] >> pos) & 0x1) == 0) {
+ env->vfp.vreg[dest].u16[j] =
+ env->vfp.vreg[src2].u16[j];
+ } else {
+ env->vfp.vreg[dest].u16[j] = env->gpr[rs1];
+ }
+ } else {
+ if (rs2 != 0) {
+ riscv_raise_exception(env,
+ RISCV_EXCP_ILLEGAL_INST, GETPC());
+ }
+ env->vfp.vreg[dest].u16[j] = env->gpr[rs1];
+ }
+ break;
+ case 32:
+ if (vm == 0) {
+ vector_get_layout(env, width, lmul, i, &idx, &pos);
+ if (((env->vfp.vreg[0].u8[idx] >> pos) & 0x1) == 0) {
+ env->vfp.vreg[dest].u32[j] =
+ env->vfp.vreg[src2].u32[j];
+ } else {
+ env->vfp.vreg[dest].u32[j] = env->gpr[rs1];
+ }
+ } else {
+ if (rs2 != 0) {
+ riscv_raise_exception(env,
+ RISCV_EXCP_ILLEGAL_INST, GETPC());
+ }
+ env->vfp.vreg[dest].u32[j] = env->gpr[rs1];
+ }
+ break;
+ case 64:
+ if (vm == 0) {
+ vector_get_layout(env, width, lmul, i, &idx, &pos);
+ if (((env->vfp.vreg[0].u8[idx] >> pos) & 0x1) == 0) {
+ env->vfp.vreg[dest].u64[j] =
+ env->vfp.vreg[src2].u64[j];
+ } else {
+ env->vfp.vreg[dest].u64[j] =
+ (uint64_t)extend_gpr(env->gpr[rs1]);
+ }
+ } else {
+ if (rs2 != 0) {
+ riscv_raise_exception(env,
+ RISCV_EXCP_ILLEGAL_INST, GETPC());
+ }
+ env->vfp.vreg[dest].u64[j] =
+ (uint64_t)extend_gpr(env->gpr[rs1]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+void VECTOR_HELPER(vmerge_vim)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int i, j, vl, idx, pos;
+ uint32_t lmul, width, src2, dest, vlmax;
+
+ vl = env->vfp.vl;
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ if (vector_vtype_ill(env)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vm == 0) {
+ vector_get_layout(env, width, lmul, i, &idx, &pos);
+ if (((env->vfp.vreg[0].u8[idx] >> pos) & 0x1) == 0) {
+ env->vfp.vreg[dest].u8[j] =
+ env->vfp.vreg[src2].u8[j];
+ } else {
+ env->vfp.vreg[dest].u8[j] =
+ (uint8_t)sign_extend(rs1, 5);
+ }
+ } else {
+ if (rs2 != 0) {
+ riscv_raise_exception(env,
+ RISCV_EXCP_ILLEGAL_INST, GETPC());
+ }
+ env->vfp.vreg[dest].u8[j] = (uint8_t)sign_extend(rs1, 5);
+ }
+ break;
+ case 16:
+ if (vm == 0) {
+ vector_get_layout(env, width, lmul, i, &idx, &pos);
+ if (((env->vfp.vreg[0].u8[idx] >> pos) & 0x1) == 0) {
+ env->vfp.vreg[dest].u16[j] =
+ env->vfp.vreg[src2].u16[j];
+ } else {
+ env->vfp.vreg[dest].u16[j] =
+ (uint16_t)sign_extend(rs1, 5);
+ }
+ } else {
+ if (rs2 != 0) {
+ riscv_raise_exception(env,
+ RISCV_EXCP_ILLEGAL_INST, GETPC());
+ }
+ env->vfp.vreg[dest].u16[j] = (uint16_t)sign_extend(rs1, 5);
+ }
+ break;
+ case 32:
+ if (vm == 0) {
+ vector_get_layout(env, width, lmul, i, &idx, &pos);
+ if (((env->vfp.vreg[0].u8[idx] >> pos) & 0x1) == 0) {
+ env->vfp.vreg[dest].u32[j] =
+ env->vfp.vreg[src2].u32[j];
+ } else {
+ env->vfp.vreg[dest].u32[j] =
+ (uint32_t)sign_extend(rs1, 5);
+ }
+ } else {
+ if (rs2 != 0) {
+ riscv_raise_exception(env,
+ RISCV_EXCP_ILLEGAL_INST, GETPC());
+ }
+ env->vfp.vreg[dest].u32[j] = (uint32_t)sign_extend(rs1, 5);
+ }
+ break;
+ case 64:
+ if (vm == 0) {
+ vector_get_layout(env, width, lmul, i, &idx, &pos);
+ if (((env->vfp.vreg[0].u8[idx] >> pos) & 0x1) == 0) {
+ env->vfp.vreg[dest].u64[j] =
+ env->vfp.vreg[src2].u64[j];
+ } else {
+ env->vfp.vreg[dest].u64[j] =
+ (uint64_t)sign_extend(rs1, 5);
+ }
+ } else {
+ if (rs2 != 0) {
+ riscv_raise_exception(env,
+ RISCV_EXCP_ILLEGAL_INST, GETPC());
+ }
+ env->vfp.vreg[dest].u64[j] = (uint64_t)sign_extend(rs1, 5);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ break;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+}
+
--
2.7.4
* [Qemu-devel] [PATCH v2 12/17] RISC-V: add vector extension fixed point instructions
From: liuzhiwei @ 2019-09-11 6:25 UTC (permalink / raw)
To: Alistair.Francis, palmer, sagark, kbastian, riku.voipio, laurent,
wenmeng_zhang
Cc: qemu-riscv, qemu-devel, wxy194768, LIU Zhiwei
From: LIU Zhiwei <zhiwei_liu@c-sky.com>
Signed-off-by: LIU Zhiwei <zhiwei_liu@c-sky.com>
---
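Note on rounding: the fixed-point helpers in this patch all add a rounding increment, selected by vxrm, before shifting right. The standalone sketch below illustrates that idea outside of QEMU. The names round_incr and shift_round and the VXRM_* constants are hypothetical and not part of the patch; the encoding 2 = rdn is an assumption taken from the vector specification (the patch handles that case through its final "return 0").

```c
#include <assert.h>
#include <stdint.h>

/* Assumed vxrm encodings: the patch's comments name 0 (rnu), 1 (rne), 3 (rod). */
enum { VXRM_RNU = 0, VXRM_RNE = 1, VXRM_RDN = 2, VXRM_ROD = 3 };

/* Hypothetical helper: value to add to v before computing v >> shift. */
static uint64_t round_incr(int vxrm, uint64_t v, unsigned shift)
{
    assert(shift < 64);
    if (shift == 0) {
        return 0;                       /* no bits are discarded */
    }

    uint64_t half = (uint64_t)1 << (shift - 1);
    uint64_t frac = v & (((uint64_t)1 << shift) - 1);  /* discarded bits */
    uint64_t lsb = (v >> shift) & 1;                   /* LSB of the result */

    switch (vxrm) {
    case VXRM_RNU:                      /* round to nearest, ties up */
        return half;
    case VXRM_RNE:                      /* round to nearest, ties to even */
        return (frac > half || (frac == half && lsb)) ? half : 0;
    case VXRM_ROD:                      /* round to odd ("jam") */
        return (frac != 0 && !lsb) ? (uint64_t)1 << shift : 0;
    default:                            /* rdn: truncate */
        return 0;
    }
}

/* Hypothetical helper: rounded right shift; assumes v + increment fits in 64 bits. */
static uint64_t shift_round(int vxrm, uint64_t v, unsigned shift)
{
    return (v + round_incr(vxrm, v, shift)) >> shift;
}
```

For example, shifting 6 right by 2 (1.5) gives 2 under both rnu and rne, while shifting 2 right by 2 (0.5) gives 1 under rnu and 0 under rne. The sketch ignores overflow of v + increment, which is why the 64-bit helpers in the patch pre-shift their operands.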
target/riscv/helper.h | 37 +
target/riscv/insn32.decode | 37 +
target/riscv/insn_trans/trans_rvv.inc.c | 37 +
target/riscv/vector_helper.c | 3388 +++++++++++++++++++++++++++++++
4 files changed, 3499 insertions(+)
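Note on the 64-bit multiply: vsmul_64 in vector_helper.c builds a 128-bit product from 32-bit halves because no wider integer type is assumed. The sketch below shows the same decomposition for unsigned operands only, as a minimal illustration. The name mul_u64_wide is hypothetical and not part of the patch; sign handling and rounding are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: full 128-bit product {hi, lo} of two unsigned 64-bit values. */
static void mul_u64_wide(uint64_t a, uint64_t b, uint64_t *hi, uint64_t *lo)
{
    uint64_t a_hi = a >> 32, a_lo = (uint32_t)a;
    uint64_t b_hi = b >> 32, b_lo = (uint32_t)b;

    uint64_t ll = a_lo * b_lo;          /* contributes to bits [63:0]   */
    uint64_t hl = a_hi * b_lo;          /* contributes to bits [95:32]  */
    uint64_t lh = a_lo * b_hi;          /* contributes to bits [95:32]  */

    /* Carry out of bits [63:32] when the three partial terms are summed. */
    uint64_t carry = ((uint64_t)(uint32_t)hl +
                      (uint64_t)(uint32_t)lh +
                      (ll >> 32)) >> 32;

    *lo = a * b;                        /* low 64 bits wrap naturally */
    *hi = a_hi * b_hi + (hl >> 32) + (lh >> 32) + carry;
}
```

With both inputs set to UINT64_MAX the product is 2^128 - 2^65 + 1, so the high word is 0xFFFFFFFFFFFFFFFE and the low word is 1, which exercises the carry path.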
diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index ab31ef7..ff6002e 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -270,5 +270,42 @@ DEF_HELPER_5(vector_vmerge_vvm, void, env, i32, i32, i32, i32)
DEF_HELPER_5(vector_vmerge_vxm, void, env, i32, i32, i32, i32)
DEF_HELPER_5(vector_vmerge_vim, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vsaddu_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vsaddu_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vsaddu_vi, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vsadd_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vsadd_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vsadd_vi, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vssubu_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vssubu_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vssub_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vssub_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vaadd_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vaadd_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vaadd_vi, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vasub_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vasub_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vsmul_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vsmul_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vwsmaccu_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vwsmaccu_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vwsmacc_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vwsmacc_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vwsmaccsu_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vwsmaccsu_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vwsmaccus_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vssrl_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vssrl_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vssrl_vi, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vssra_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vssra_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vssra_vi, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vnclipu_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vnclipu_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vnclipu_vi, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vnclip_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vnclip_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vnclip_vi, void, env, i32, i32, i32, i32)
+
DEF_HELPER_4(vector_vsetvli, void, env, i32, i32, i32)
DEF_HELPER_4(vector_vsetvl, void, env, i32, i32, i32)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 6db18c5..a82e53e 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -410,5 +410,42 @@ vmerge_vvm 010111 . ..... ..... 000 ..... 1010111 @r_vm
vmerge_vxm 010111 . ..... ..... 100 ..... 1010111 @r_vm
vmerge_vim 010111 . ..... ..... 011 ..... 1010111 @r_vm
+vsaddu_vv 100000 . ..... ..... 000 ..... 1010111 @r_vm
+vsaddu_vx 100000 . ..... ..... 100 ..... 1010111 @r_vm
+vsaddu_vi 100000 . ..... ..... 011 ..... 1010111 @r_vm
+vsadd_vv 100001 . ..... ..... 000 ..... 1010111 @r_vm
+vsadd_vx 100001 . ..... ..... 100 ..... 1010111 @r_vm
+vsadd_vi 100001 . ..... ..... 011 ..... 1010111 @r_vm
+vssubu_vv 100010 . ..... ..... 000 ..... 1010111 @r_vm
+vssubu_vx 100010 . ..... ..... 100 ..... 1010111 @r_vm
+vssub_vv 100011 . ..... ..... 000 ..... 1010111 @r_vm
+vssub_vx 100011 . ..... ..... 100 ..... 1010111 @r_vm
+vaadd_vv 100100 . ..... ..... 000 ..... 1010111 @r_vm
+vaadd_vx 100100 . ..... ..... 100 ..... 1010111 @r_vm
+vaadd_vi 100100 . ..... ..... 011 ..... 1010111 @r_vm
+vasub_vv 100110 . ..... ..... 000 ..... 1010111 @r_vm
+vasub_vx 100110 . ..... ..... 100 ..... 1010111 @r_vm
+vsmul_vv 100111 . ..... ..... 000 ..... 1010111 @r_vm
+vsmul_vx 100111 . ..... ..... 100 ..... 1010111 @r_vm
+vwsmaccu_vv 111100 . ..... ..... 000 ..... 1010111 @r_vm
+vwsmaccu_vx 111100 . ..... ..... 100 ..... 1010111 @r_vm
+vwsmacc_vv 111101 . ..... ..... 000 ..... 1010111 @r_vm
+vwsmacc_vx 111101 . ..... ..... 100 ..... 1010111 @r_vm
+vwsmaccsu_vv 111110 . ..... ..... 000 ..... 1010111 @r_vm
+vwsmaccsu_vx 111110 . ..... ..... 100 ..... 1010111 @r_vm
+vwsmaccus_vx 111111 . ..... ..... 100 ..... 1010111 @r_vm
+vssrl_vv 101010 . ..... ..... 000 ..... 1010111 @r_vm
+vssrl_vx 101010 . ..... ..... 100 ..... 1010111 @r_vm
+vssrl_vi 101010 . ..... ..... 011 ..... 1010111 @r_vm
+vssra_vv 101011 . ..... ..... 000 ..... 1010111 @r_vm
+vssra_vx 101011 . ..... ..... 100 ..... 1010111 @r_vm
+vssra_vi 101011 . ..... ..... 011 ..... 1010111 @r_vm
+vnclipu_vv 101110 . ..... ..... 000 ..... 1010111 @r_vm
+vnclipu_vx 101110 . ..... ..... 100 ..... 1010111 @r_vm
+vnclipu_vi 101110 . ..... ..... 011 ..... 1010111 @r_vm
+vnclip_vv 101111 . ..... ..... 000 ..... 1010111 @r_vm
+vnclip_vx 101111 . ..... ..... 100 ..... 1010111 @r_vm
+vnclip_vi 101111 . ..... ..... 011 ..... 1010111 @r_vm
+
vsetvli 0 ........... ..... 111 ..... 1010111 @r2_zimm
vsetvl 1000000 ..... ..... 111 ..... 1010111 @r
diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index 1ba52e7..d650e8c 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -299,5 +299,42 @@ GEN_VECTOR_R_VM(vmerge_vvm)
GEN_VECTOR_R_VM(vmerge_vxm)
GEN_VECTOR_R_VM(vmerge_vim)
+GEN_VECTOR_R_VM(vsaddu_vv)
+GEN_VECTOR_R_VM(vsaddu_vx)
+GEN_VECTOR_R_VM(vsaddu_vi)
+GEN_VECTOR_R_VM(vsadd_vv)
+GEN_VECTOR_R_VM(vsadd_vx)
+GEN_VECTOR_R_VM(vsadd_vi)
+GEN_VECTOR_R_VM(vssubu_vv)
+GEN_VECTOR_R_VM(vssubu_vx)
+GEN_VECTOR_R_VM(vssub_vv)
+GEN_VECTOR_R_VM(vssub_vx)
+GEN_VECTOR_R_VM(vaadd_vv)
+GEN_VECTOR_R_VM(vaadd_vx)
+GEN_VECTOR_R_VM(vaadd_vi)
+GEN_VECTOR_R_VM(vasub_vv)
+GEN_VECTOR_R_VM(vasub_vx)
+GEN_VECTOR_R_VM(vsmul_vv)
+GEN_VECTOR_R_VM(vsmul_vx)
+GEN_VECTOR_R_VM(vwsmaccu_vv)
+GEN_VECTOR_R_VM(vwsmaccu_vx)
+GEN_VECTOR_R_VM(vwsmacc_vv)
+GEN_VECTOR_R_VM(vwsmacc_vx)
+GEN_VECTOR_R_VM(vwsmaccsu_vv)
+GEN_VECTOR_R_VM(vwsmaccsu_vx)
+GEN_VECTOR_R_VM(vwsmaccus_vx)
+GEN_VECTOR_R_VM(vssrl_vv)
+GEN_VECTOR_R_VM(vssrl_vx)
+GEN_VECTOR_R_VM(vssrl_vi)
+GEN_VECTOR_R_VM(vssra_vv)
+GEN_VECTOR_R_VM(vssra_vx)
+GEN_VECTOR_R_VM(vssra_vi)
+GEN_VECTOR_R_VM(vnclipu_vv)
+GEN_VECTOR_R_VM(vnclipu_vx)
+GEN_VECTOR_R_VM(vnclipu_vi)
+GEN_VECTOR_R_VM(vnclip_vv)
+GEN_VECTOR_R_VM(vnclip_vx)
+GEN_VECTOR_R_VM(vnclip_vi)
+
GEN_VECTOR_R2_ZIMM(vsetvli)
GEN_VECTOR_R(vsetvl)
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 49f1cb8..2292fa5 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -75,6 +75,844 @@ static target_ulong vector_get_index(CPURISCVState *env, int rs1, int rs2,
return 0;
}
+/* Saturating add/sub helpers; each sets vxsat when the result saturates. */
+static inline uint8_t sat_add_u8(CPURISCVState *env, uint8_t a, uint8_t b)
+{
+ uint8_t res = a + b;
+ if (res < a) {
+ res = UINT8_MAX;
+ env->vfp.vxsat = 0x1;
+ }
+ return res;
+}
+
+static inline uint16_t sat_add_u16(CPURISCVState *env, uint16_t a, uint16_t b)
+{
+ uint16_t res = a + b;
+ if (res < a) {
+ res = UINT16_MAX;
+ env->vfp.vxsat = 0x1;
+ }
+ return res;
+}
+
+static inline uint32_t sat_add_u32(CPURISCVState *env, uint32_t a, uint32_t b)
+{
+ uint32_t res = a + b;
+ if (res < a) {
+ res = UINT32_MAX;
+ env->vfp.vxsat = 0x1;
+ }
+ return res;
+}
+
+static inline uint64_t sat_add_u64(CPURISCVState *env, uint64_t a, uint64_t b)
+{
+ uint64_t res = a + b;
+ if (res < a) {
+ res = UINT64_MAX;
+ env->vfp.vxsat = 0x1;
+ }
+ return res;
+}
+
+static inline uint8_t sat_add_s8(CPURISCVState *env, uint8_t a, uint8_t b)
+{
+ uint8_t res = a + b;
+ if (((res ^ a) & SIGNBIT8) && !((a ^ b) & SIGNBIT8)) {
+ res = ~(((int8_t)a >> 7) ^ SIGNBIT8);
+ env->vfp.vxsat = 0x1;
+ }
+ return res;
+}
+
+static inline uint16_t sat_add_s16(CPURISCVState *env, uint16_t a, uint16_t b)
+{
+ uint16_t res = a + b;
+ if (((res ^ a) & SIGNBIT16) && !((a ^ b) & SIGNBIT16)) {
+ res = ~(((int16_t)a >> 15) ^ SIGNBIT16);
+ env->vfp.vxsat = 0x1;
+ }
+ return res;
+}
+
+static inline uint32_t sat_add_s32(CPURISCVState *env, uint32_t a, uint32_t b)
+{
+ uint32_t res = a + b;
+ if (((res ^ a) & SIGNBIT32) && !((a ^ b) & SIGNBIT32)) {
+ res = ~(((int32_t)a >> 31) ^ SIGNBIT32);
+ env->vfp.vxsat = 0x1;
+ }
+ return res;
+}
+
+static inline uint64_t sat_add_s64(CPURISCVState *env, uint64_t a, uint64_t b)
+{
+ uint64_t res = a + b;
+ if (((res ^ a) & SIGNBIT64) && !((a ^ b) & SIGNBIT64)) {
+ res = ~(((int64_t)a >> 63) ^ SIGNBIT64);
+ env->vfp.vxsat = 0x1;
+ }
+ return res;
+}
+
+static inline uint8_t sat_sub_u8(CPURISCVState *env, uint8_t a, uint8_t b)
+{
+ uint8_t res = a - b;
+ if (res > a) {
+ res = 0;
+ env->vfp.vxsat = 0x1;
+ }
+ return res;
+}
+
+static inline uint16_t sat_sub_u16(CPURISCVState *env, uint16_t a, uint16_t b)
+{
+ uint16_t res = a - b;
+ if (res > a) {
+ res = 0;
+ env->vfp.vxsat = 0x1;
+ }
+ return res;
+}
+
+static inline uint32_t sat_sub_u32(CPURISCVState *env, uint32_t a, uint32_t b)
+{
+ uint32_t res = a - b;
+ if (res > a) {
+ res = 0;
+ env->vfp.vxsat = 0x1;
+ }
+ return res;
+}
+
+static inline uint64_t sat_sub_u64(CPURISCVState *env, uint64_t a, uint64_t b)
+{
+ uint64_t res = a - b;
+ if (res > a) {
+ res = 0;
+ env->vfp.vxsat = 0x1;
+ }
+ return res;
+}
+
+static inline uint8_t sat_sub_s8(CPURISCVState *env, uint8_t a, uint8_t b)
+{
+ uint8_t res = a - b;
+ if (((res ^ a) & SIGNBIT8) && ((a ^ b) & SIGNBIT8)) {
+ res = ~(((int8_t)a >> 7) ^ SIGNBIT8);
+ env->vfp.vxsat = 0x1;
+ }
+ return res;
+}
+
+static inline uint16_t sat_sub_s16(CPURISCVState *env, uint16_t a, uint16_t b)
+{
+ uint16_t res = a - b;
+ if (((res ^ a) & SIGNBIT16) && ((a ^ b) & SIGNBIT16)) {
+ res = ~(((int16_t)a >> 15) ^ SIGNBIT16);
+ env->vfp.vxsat = 0x1;
+ }
+ return res;
+}
+
+static inline uint32_t sat_sub_s32(CPURISCVState *env, uint32_t a, uint32_t b)
+{
+ uint32_t res = a - b;
+ if (((res ^ a) & SIGNBIT32) && ((a ^ b) & SIGNBIT32)) {
+ res = ~(((int32_t)a >> 31) ^ SIGNBIT32);
+ env->vfp.vxsat = 0x1;
+ }
+ return res;
+}
+
+static inline uint64_t sat_sub_s64(CPURISCVState *env, uint64_t a, uint64_t b)
+{
+ uint64_t res = a - b;
+ if (((res ^ a) & SIGNBIT64) && ((a ^ b) & SIGNBIT64)) {
+ res = ~(((int64_t)a >> 63) ^ SIGNBIT64);
+ env->vfp.vxsat = 0x1;
+ }
+ return res;
+}
+
+static uint64_t fix_data_round(CPURISCVState *env, uint64_t result,
+ uint8_t shift)
+{
+ uint64_t lsb_1 = (uint64_t)1 << shift;
+ int mod = env->vfp.vxrm;
+ uint64_t mask = ((uint64_t)1 << shift) - 1;
+
+ if (mod == 0x0) { /* rnu */
+ return lsb_1 >> 1;
+ } else if (mod == 0x1) { /* rne */
+ if ((result & mask) > (lsb_1 >> 1) ||
+ (((result & mask) == (lsb_1 >> 1)) &&
+ (((result >> shift) & 0x1)) == 1)) {
+ return lsb_1 >> 1;
+ }
+ } else if (mod == 0x3) { /* rod */
+ if (((result & mask) >= 0x1) && (((result >> shift) & 0x1) == 0)) {
+ return lsb_1;
+ }
+ }
+ return 0;
+}
+
+static int8_t saturate_s8(CPURISCVState *env, int16_t res)
+{
+ if (res > INT8_MAX) {
+ env->vfp.vxsat = 0x1;
+ return INT8_MAX;
+ } else if (res < INT8_MIN) {
+ env->vfp.vxsat = 0x1;
+ return INT8_MIN;
+ } else {
+ return res;
+ }
+}
+
+static uint8_t saturate_u8(CPURISCVState *env, uint16_t res)
+{
+ if (res > UINT8_MAX) {
+ env->vfp.vxsat = 0x1;
+ return UINT8_MAX;
+ } else {
+ return res;
+ }
+}
+
+static uint16_t saturate_u16(CPURISCVState *env, uint32_t res)
+{
+ if (res > UINT16_MAX) {
+ env->vfp.vxsat = 0x1;
+ return UINT16_MAX;
+ } else {
+ return res;
+ }
+}
+
+static uint32_t saturate_u32(CPURISCVState *env, uint64_t res)
+{
+ if (res > UINT32_MAX) {
+ env->vfp.vxsat = 0x1;
+ return UINT32_MAX;
+ } else {
+ return res;
+ }
+}
+
+static int16_t saturate_s16(CPURISCVState *env, int32_t res)
+{
+ if (res > INT16_MAX) {
+ env->vfp.vxsat = 0x1;
+ return INT16_MAX;
+ } else if (res < INT16_MIN) {
+ env->vfp.vxsat = 0x1;
+ return INT16_MIN;
+ } else {
+ return res;
+ }
+}
+
+static int32_t saturate_s32(CPURISCVState *env, int64_t res)
+{
+ if (res > INT32_MAX) {
+ env->vfp.vxsat = 0x1;
+ return INT32_MAX;
+ } else if (res < INT32_MIN) {
+ env->vfp.vxsat = 0x1;
+ return INT32_MIN;
+ } else {
+ return res;
+ }
+}
+static uint16_t vwsmaccu_8(CPURISCVState *env, uint8_t a, uint8_t b,
+ uint16_t c)
+{
+ uint16_t round, res;
+ uint16_t product = (uint16_t)a * (uint16_t)b;
+
+ round = (uint16_t)fix_data_round(env, (uint64_t)product, 4);
+ res = (round + product) >> 4;
+ return sat_add_u16(env, c, res);
+}
+
+static uint32_t vwsmaccu_16(CPURISCVState *env, uint16_t a, uint16_t b,
+ uint32_t c)
+{
+ uint32_t round, res;
+ uint32_t product = (uint32_t)a * (uint32_t)b;
+
+ round = (uint32_t)fix_data_round(env, (uint64_t)product, 8);
+ res = (round + product) >> 8;
+ return sat_add_u32(env, c, res);
+}
+
+static uint64_t vwsmaccu_32(CPURISCVState *env, uint32_t a, uint32_t b,
+ uint64_t c)
+{
+ uint64_t round, res;
+ uint64_t product = (uint64_t)a * (uint64_t)b;
+
+ round = (uint64_t)fix_data_round(env, (uint64_t)product, 16);
+ res = (round + product) >> 16;
+ return sat_add_u64(env, c, res);
+}
+
+static int16_t vwsmacc_8(CPURISCVState *env, int8_t a, int8_t b,
+ int16_t c)
+{
+ int16_t round, res;
+ int16_t product = (int16_t)a * (int16_t)b;
+
+ round = (int16_t)fix_data_round(env, (uint64_t)product, 4);
+ res = (int16_t)(round + product) >> 4;
+ return sat_add_s16(env, c, res);
+}
+
+static int32_t vwsmacc_16(CPURISCVState *env, int16_t a, int16_t b,
+ int32_t c)
+{
+ int32_t round, res;
+ int32_t product = (int32_t)a * (int32_t)b;
+
+ round = (int32_t)fix_data_round(env, (uint64_t)product, 8);
+ res = (int32_t)(round + product) >> 8;
+ return sat_add_s32(env, c, res);
+}
+
+static int64_t vwsmacc_32(CPURISCVState *env, int32_t a, int32_t b,
+ int64_t c)
+{
+ int64_t round, res;
+ int64_t product = (int64_t)a * (int64_t)b;
+
+ round = (int64_t)fix_data_round(env, (uint64_t)product, 16);
+ res = (int64_t)(round + product) >> 16;
+ return sat_add_s64(env, c, res);
+}
+
+static int16_t vwsmaccsu_8(CPURISCVState *env, uint8_t a, int8_t b,
+ int16_t c)
+{
+ int16_t round, res;
+ int16_t product = (uint16_t)a * (int16_t)b;
+
+ round = (int16_t)fix_data_round(env, (uint64_t)product, 4);
+ res = (round + product) >> 4;
+ return sat_sub_s16(env, c, res);
+}
+
+static int32_t vwsmaccsu_16(CPURISCVState *env, uint16_t a, int16_t b,
+ int32_t c)
+{
+ int32_t round, res;
+ int32_t product = (uint32_t)a * (int32_t)b;
+
+ round = (int32_t)fix_data_round(env, (uint64_t)product, 8);
+ res = (round + product) >> 8;
+ return sat_sub_s32(env, c, res);
+}
+
+static int64_t vwsmaccsu_32(CPURISCVState *env, uint32_t a, int32_t b,
+ int64_t c)
+{
+ int64_t round, res;
+ int64_t product = (uint64_t)a * (int64_t)b;
+
+ round = (int64_t)fix_data_round(env, (uint64_t)product, 16);
+ res = (round + product) >> 16;
+ return sat_sub_s64(env, c, res);
+}
+
+static int16_t vwsmaccus_8(CPURISCVState *env, int8_t a, uint8_t b,
+ int16_t c)
+{
+ int16_t round, res;
+ int16_t product = (int16_t)a * (uint16_t)b;
+
+ round = (int16_t)fix_data_round(env, (uint64_t)product, 4);
+ res = (round + product) >> 4;
+ return sat_sub_s16(env, c, res);
+}
+
+static int32_t vwsmaccus_16(CPURISCVState *env, int16_t a, uint16_t b,
+ int32_t c)
+{
+ int32_t round, res;
+ int32_t product = (int32_t)a * (uint32_t)b;
+
+ round = (int32_t)fix_data_round(env, (uint64_t)product, 8);
+ res = (round + product) >> 8;
+ return sat_sub_s32(env, c, res);
+}
+
+static int64_t vwsmaccus_32(CPURISCVState *env, int32_t a, uint32_t b,
+ int64_t c)
+{
+ int64_t round, res;
+ int64_t product = (int64_t)a * (uint64_t)b;
+
+ round = (int64_t)fix_data_round(env, (uint64_t)product, 16);
+ res = (round + product) >> 16;
+ return sat_sub_s64(env, c, res);
+}
+
+static int8_t vssra_8(CPURISCVState *env, int8_t a, uint8_t b)
+{
+ int16_t round, res;
+ uint8_t shift = b & 0x7;
+
+ round = (int16_t)fix_data_round(env, (uint64_t)a, shift);
+ res = (a + round) >> shift;
+
+ return res;
+}
+
+static int16_t vssra_16(CPURISCVState *env, int16_t a, uint16_t b)
+{
+ int32_t round, res;
+ uint8_t shift = b & 0xf;
+
+ round = (int32_t)fix_data_round(env, (uint64_t)a, shift);
+ res = (a + round) >> shift;
+ return res;
+}
+
+static int32_t vssra_32(CPURISCVState *env, int32_t a, uint32_t b)
+{
+ int64_t round, res;
+ uint8_t shift = b & 0x1f;
+
+ round = (int64_t)fix_data_round(env, (uint64_t)a, shift);
+ res = (a + round) >> shift;
+ return res;
+}
+
+static int64_t vssra_64(CPURISCVState *env, int64_t a, uint64_t b)
+{
+ int64_t round, res;
+ uint8_t shift = b & 0x3f;
+
+ round = (int64_t)fix_data_round(env, (uint64_t)a, shift);
+ res = (a >> (shift - 1)) + (round >> (shift - 1));
+ return res >> 1;
+}
+
+static int8_t vssrai_8(CPURISCVState *env, int8_t a, uint8_t b)
+{
+ int16_t round, res;
+
+ round = (int16_t)fix_data_round(env, (uint64_t)a, b);
+ res = (a + round) >> b;
+ return res;
+}
+
+static int16_t vssrai_16(CPURISCVState *env, int16_t a, uint8_t b)
+{
+ int32_t round, res;
+
+ round = (int32_t)fix_data_round(env, (uint64_t)a, b);
+ res = (a + round) >> b;
+ return res;
+}
+
+static int32_t vssrai_32(CPURISCVState *env, int32_t a, uint8_t b)
+{
+ int64_t round, res;
+
+ round = (int64_t)fix_data_round(env, (uint64_t)a, b);
+ res = (a + round) >> b;
+ return res;
+}
+
+static int64_t vssrai_64(CPURISCVState *env, int64_t a, uint8_t b)
+{
+ int64_t round, res;
+
+ round = (int64_t)fix_data_round(env, (uint64_t)a, b);
+ res = (a >> (b - 1)) + (round >> (b - 1));
+ return res >> 1;
+}
+
+static int8_t vnclip_16(CPURISCVState *env, int16_t a, uint8_t b)
+{
+ int16_t round, res;
+ uint8_t shift = b & 0xf;
+
+ round = (int16_t)fix_data_round(env, (uint64_t)a, shift);
+ res = (a + round) >> shift;
+
+ return saturate_s8(env, res);
+}
+
+static int16_t vnclip_32(CPURISCVState *env, int32_t a, uint16_t b)
+{
+ int32_t round, res;
+ uint8_t shift = b & 0x1f;
+
+ round = (int32_t)fix_data_round(env, (uint64_t)a, shift);
+ res = (a + round) >> shift;
+ return saturate_s16(env, res);
+}
+
+static int32_t vnclip_64(CPURISCVState *env, int64_t a, uint32_t b)
+{
+ int64_t round, res;
+ uint8_t shift = b & 0x3f;
+
+ round = (int64_t)fix_data_round(env, (uint64_t)a, shift);
+ res = (a + round) >> shift;
+
+ return saturate_s32(env, res);
+}
+
+static int8_t vnclipi_16(CPURISCVState *env, int16_t a, uint8_t b)
+{
+ int16_t round, res;
+
+ round = (int16_t)fix_data_round(env, (uint64_t)a, b);
+ res = (a + round) >> b;
+
+ return saturate_s8(env, res);
+}
+
+static int16_t vnclipi_32(CPURISCVState *env, int32_t a, uint8_t b)
+{
+ int32_t round, res;
+
+ round = (int32_t)fix_data_round(env, (uint64_t)a, b);
+ res = (a + round) >> b;
+
+ return saturate_s16(env, res);
+}
+
+static int32_t vnclipi_64(CPURISCVState *env, int64_t a, uint8_t b)
+{
+ int64_t round, res;
+
+ round = (int64_t)fix_data_round(env, (uint64_t)a, b);
+ res = (a + round) >> b;
+
+ return saturate_s32(env, res);
+}
+
+static uint8_t vnclipu_16(CPURISCVState *env, uint16_t a, uint8_t b)
+{
+ uint16_t round, res;
+ uint8_t shift = b & 0xf;
+
+ round = (uint16_t)fix_data_round(env, (uint64_t)a, shift);
+ res = (a + round) >> shift;
+
+ return saturate_u8(env, res);
+}
+
+static uint16_t vnclipu_32(CPURISCVState *env, uint32_t a, uint16_t b)
+{
+ uint32_t round, res;
+ uint8_t shift = b & 0x1f;
+
+ round = (uint32_t)fix_data_round(env, (uint64_t)a, shift);
+ res = (a + round) >> shift;
+
+ return saturate_u16(env, res);
+}
+
+static uint32_t vnclipu_64(CPURISCVState *env, uint64_t a, uint32_t b)
+{
+ uint64_t round, res;
+ uint8_t shift = b & 0x3f;
+
+ round = (uint64_t)fix_data_round(env, (uint64_t)a, shift);
+ res = (a + round) >> shift;
+
+ return saturate_u32(env, res);
+}
+
+static uint8_t vnclipui_16(CPURISCVState *env, uint16_t a, uint8_t b)
+{
+ uint16_t round, res;
+
+ round = (uint16_t)fix_data_round(env, (uint64_t)a, b);
+ res = (a + round) >> b;
+
+ return saturate_u8(env, res);
+}
+
+static uint16_t vnclipui_32(CPURISCVState *env, uint32_t a, uint8_t b)
+{
+ uint32_t round, res;
+
+ round = (uint32_t)fix_data_round(env, (uint64_t)a, b);
+ res = (a + round) >> b;
+
+ return saturate_u16(env, res);
+}
+
+static uint32_t vnclipui_64(CPURISCVState *env, uint64_t a, uint8_t b)
+{
+ uint64_t round, res;
+
+ round = (uint64_t)fix_data_round(env, (uint64_t)a, b);
+ res = (a + round) >> b;
+
+ return saturate_u32(env, res);
+}
+
+static uint8_t vssrl_8(CPURISCVState *env, uint8_t a, uint8_t b)
+{
+ uint16_t round, res;
+ uint8_t shift = b & 0x7;
+
+ round = (uint16_t)fix_data_round(env, (uint64_t)a, shift);
+ res = (a + round) >> shift;
+ return res;
+}
+
+static uint16_t vssrl_16(CPURISCVState *env, uint16_t a, uint16_t b)
+{
+ uint32_t round, res;
+ uint8_t shift = b & 0xf;
+
+ round = (uint32_t)fix_data_round(env, (uint64_t)a, shift);
+ res = (a + round) >> shift;
+ return res;
+}
+
+static uint32_t vssrl_32(CPURISCVState *env, uint32_t a, uint32_t b)
+{
+ uint64_t round, res;
+ uint8_t shift = b & 0x1f;
+
+ round = (uint64_t)fix_data_round(env, (uint64_t)a, shift);
+ res = (a + round) >> shift;
+ return res;
+}
+
+static uint64_t vssrl_64(CPURISCVState *env, uint64_t a, uint64_t b)
+{
+ uint64_t round, res;
+ uint8_t shift = b & 0x3f;
+
+ round = (uint64_t)fix_data_round(env, (uint64_t)a, shift);
+ res = (a >> (shift - 1)) + (round >> (shift - 1));
+ return res >> 1;
+}
+
+static uint8_t vssrli_8(CPURISCVState *env, uint8_t a, uint8_t b)
+{
+ uint16_t round, res;
+
+ round = (uint16_t)fix_data_round(env, (uint64_t)a, b);
+ res = (a + round) >> b;
+ return res;
+}
+
+static uint16_t vssrli_16(CPURISCVState *env, uint16_t a, uint8_t b)
+{
+ uint32_t round, res;
+
+ round = (uint32_t)fix_data_round(env, (uint64_t)a, b);
+ res = (a + round) >> b;
+ return res;
+}
+
+static uint32_t vssrli_32(CPURISCVState *env, uint32_t a, uint8_t b)
+{
+ uint64_t round, res;
+
+ round = (uint64_t)fix_data_round(env, (uint64_t)a, b);
+ res = (a + round) >> b;
+ return res;
+}
+
+static uint64_t vssrli_64(CPURISCVState *env, uint64_t a, uint8_t b)
+{
+ uint64_t round, res;
+
+ round = (uint64_t)fix_data_round(env, (uint64_t)a, b);
+ res = (a >> (b - 1)) + (round >> (b - 1));
+ return res >> 1;
+}
+
+static int8_t vsmul_8(CPURISCVState *env, int8_t a, int8_t b)
+{
+ int16_t round;
+ int8_t res;
+ int16_t product = (int16_t)a * (int16_t)b;
+
+ if (a == INT8_MIN && b == INT8_MIN) {
+ env->vfp.vxsat = 1;
+
+ return INT8_MAX;
+ }
+
+ round = (int16_t)fix_data_round(env, (uint64_t)product, 7);
+ res = sat_add_s16(env, product, round) >> 7;
+ return res;
+}
+
+static int16_t vsmul_16(CPURISCVState *env, int16_t a, int16_t b)
+{
+ int32_t round;
+ int16_t res;
+ int32_t product = (int32_t)a * (int32_t)b;
+
+ if (a == INT16_MIN && b == INT16_MIN) {
+ env->vfp.vxsat = 1;
+
+ return INT16_MAX;
+ }
+
+ round = (int32_t)fix_data_round(env, (uint64_t)product, 15);
+ res = sat_add_s32(env, product, round) >> 15;
+ return res;
+}
+
+static int32_t vsmul_32(CPURISCVState *env, int32_t a, int32_t b)
+{
+ int64_t round;
+ int32_t res;
+ int64_t product = (int64_t)a * (int64_t)b;
+
+ if (a == INT32_MIN && b == INT32_MIN) {
+ env->vfp.vxsat = 1;
+
+ return INT32_MAX;
+ }
+
+ round = (int64_t)fix_data_round(env, (uint64_t)product, 31);
+ res = sat_add_s64(env, product, round) >> 31;
+ return res;
+}
+
+static int64_t vsmul_64(CPURISCVState *env, int64_t a, int64_t b)
+{
+ int64_t res;
+ uint64_t abs_a = a, abs_b = b;
+ uint64_t lo_64, hi_64, carry, round;
+
+ if (a == INT64_MIN && b == INT64_MIN) {
+ env->vfp.vxsat = 1;
+
+ return INT64_MAX;
+ }
+
+ if (a < 0) {
+ abs_a = ~a + 1;
+ }
+ if (b < 0) {
+ abs_b = ~b + 1;
+ }
+
+ /* first get the whole product in {hi_64, lo_64} */
+ uint64_t a_hi = abs_a >> 32;
+ uint64_t a_lo = (uint32_t)abs_a;
+ uint64_t b_hi = abs_b >> 32;
+ uint64_t b_lo = (uint32_t)abs_b;
+
+ /*
+ * abs_a * abs_b = ((a_hi << 32) + a_lo) * ((b_hi << 32) + b_lo)
+ * = ((a_hi * b_hi) << 64) + ((a_hi * b_lo) << 32) +
+ * ((a_lo * b_hi) << 32) + (a_lo * b_lo)
+ * = {hi_64, lo_64}
+ * hi_64 = (a_hi * b_hi) + ((a_hi * b_lo) >> 32) + ((a_lo * b_hi) >> 32)
+ * + carry
+ * carry = ((uint64_t)(uint32_t)(a_hi * b_lo) +
+ * (uint64_t)(uint32_t)(a_lo * b_hi) + ((a_lo * b_lo) >> 32)) >> 32
+ */
+
+ lo_64 = abs_a * abs_b;
+ carry = ((uint64_t)(uint32_t)(a_hi * b_lo) +
+ (uint64_t)(uint32_t)(a_lo * b_hi) +
+ ((a_lo * b_lo) >> 32)) >> 32;
+
+ hi_64 = a_hi * b_hi +
+ ((a_hi * b_lo) >> 32) + ((a_lo * b_hi) >> 32) +
+ carry;
+
+ if ((a ^ b) & SIGNBIT64) {
+ lo_64 = ~lo_64;
+ hi_64 = ~hi_64;
+ if (lo_64 == UINT64_MAX) {
+ lo_64 = 0;
+ hi_64 += 1;
+ } else {
+ lo_64 += 1;
+ }
+ }
+
+ /* apply the rounding increment and extract product bits [126:63] */
+ round = fix_data_round(env, lo_64, 63);
+ if ((lo_64 + round) < lo_64) {
+ hi_64 += 1;
+ res = (hi_64 << 1);
+ } else {
+ res = (hi_64 << 1) | ((lo_64 + round) >> 63);
+ }
+
+ return res;
+}
+static inline int8_t avg_round_s8(CPURISCVState *env, int8_t a, int8_t b)
+{
+ int16_t round;
+ int8_t res;
+ int16_t sum = a + b;
+
+ round = (int16_t)fix_data_round(env, (uint64_t)sum, 1);
+ res = (sum + round) >> 1;
+
+ return res;
+}
+
+static inline int16_t avg_round_s16(CPURISCVState *env, int16_t a, int16_t b)
+{
+ int32_t round;
+ int16_t res;
+ int32_t sum = a + b;
+
+ round = (int32_t)fix_data_round(env, (uint64_t)sum, 1);
+ res = (sum + round) >> 1;
+
+ return res;
+}
+
+static inline int32_t avg_round_s32(CPURISCVState *env, int32_t a, int32_t b)
+{
+ int64_t round;
+ int32_t res;
+ int64_t sum = a + b;
+
+ round = (int64_t)fix_data_round(env, (uint64_t)sum, 1);
+ res = (sum + round) >> 1;
+
+ return res;
+}
+
+static inline int64_t avg_round_s64(CPURISCVState *env, int64_t a, int64_t b)
+{
+ int64_t rem = (a & 0x1) + (b & 0x1);
+ int64_t res = (a >> 1) + (b >> 1) + (rem >> 1);
+ int mod = env->vfp.vxrm;
+
+ if (mod == 0x0) { /* rnu */
+ if (rem == 0x1) {
+ return res + 1;
+ }
+ } else if (mod == 0x1) { /* rne */
+ if ((rem & 0x1) == 1 && ((res & 0x1) == 1)) {
+ return res + 1;
+ }
+ } else if (mod == 0x3) { /* rod */
+ if (((rem & 0x1) >= 0x1) && (res & 0x1) == 0) {
+ return res + 1;
+ }
+ }
+ return res;
+}
+
static inline bool vector_vtype_ill(CPURISCVState *env)
{
if ((env->vfp.vtype >> (sizeof(target_ulong) - 1)) & 0x1) {
@@ -13726,3 +14564,2553 @@ void VECTOR_HELPER(vmerge_vim)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
env->vfp.vstart = 0;
}
+/* vsaddu.vv vd, vs2, vs1, vm # Vector-vector */
+void VECTOR_HELPER(vsaddu_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src1, src2;
+
+ lmul = vector_get_lmul(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ vl = env->vfp.vl;
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ src1 = rs1 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u8[j] = sat_add_u8(env,
+ env->vfp.vreg[src1].u8[j], env->vfp.vreg[src2].u8[j]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[j] = sat_add_u16(env,
+ env->vfp.vreg[src1].u16[j], env->vfp.vreg[src2].u16[j]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[j] = sat_add_u32(env,
+ env->vfp.vreg[src1].u32[j], env->vfp.vreg[src2].u32[j]);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[j] = sat_add_u64(env,
+ env->vfp.vreg[src1].u64[j], env->vfp.vreg[src2].u64[j]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vsaddu.vx vd, vs2, rs1, vm # vector-scalar */
+void VECTOR_HELPER(vsaddu_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src2;
+
+ lmul = vector_get_lmul(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ vl = env->vfp.vl;
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u8[j] = sat_add_u8(env,
+ env->vfp.vreg[src2].u8[j], env->gpr[rs1]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[j] = sat_add_u16(env,
+ env->vfp.vreg[src2].u16[j], env->gpr[rs1]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[j] = sat_add_u32(env,
+ env->vfp.vreg[src2].u32[j], env->gpr[rs1]);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[j] = sat_add_u64(env,
+ env->vfp.vreg[src2].u64[j], env->gpr[rs1]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vsaddu.vi vd, vs2, imm, vm # vector-immediate */
+void VECTOR_HELPER(vsaddu_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src2;
+
+ lmul = vector_get_lmul(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ vl = env->vfp.vl;
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u8[j] = sat_add_u8(env,
+ env->vfp.vreg[src2].u8[j], rs1);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[j] = sat_add_u16(env,
+ env->vfp.vreg[src2].u16[j], rs1);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[j] = sat_add_u32(env,
+ env->vfp.vreg[src2].u32[j], rs1);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[j] = sat_add_u64(env,
+ env->vfp.vreg[src2].u64[j], rs1);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vsadd.vv vd, vs2, vs1, vm # Vector-vector */
+void VECTOR_HELPER(vsadd_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src1, src2;
+
+ lmul = vector_get_lmul(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ vl = env->vfp.vl;
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ src1 = rs1 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[j] = sat_add_s8(env,
+ env->vfp.vreg[src1].s8[j], env->vfp.vreg[src2].s8[j]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[j] = sat_add_s16(env,
+ env->vfp.vreg[src1].s16[j], env->vfp.vreg[src2].s16[j]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[j] = sat_add_s32(env,
+ env->vfp.vreg[src1].s32[j], env->vfp.vreg[src2].s32[j]);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[j] = sat_add_s64(env,
+ env->vfp.vreg[src1].s64[j], env->vfp.vreg[src2].s64[j]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vsadd.vx vd, vs2, rs1, vm # vector-scalar */
+void VECTOR_HELPER(vsadd_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src2;
+
+ lmul = vector_get_lmul(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ vl = env->vfp.vl;
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[j] = sat_add_s8(env,
+ env->vfp.vreg[src2].s8[j], env->gpr[rs1]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[j] = sat_add_s16(env,
+ env->vfp.vreg[src2].s16[j], env->gpr[rs1]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[j] = sat_add_s32(env,
+ env->vfp.vreg[src2].s32[j], env->gpr[rs1]);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[j] = sat_add_s64(env,
+ env->vfp.vreg[src2].s64[j], env->gpr[rs1]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vsadd.vi vd, vs2, imm, vm # vector-immediate */
+void VECTOR_HELPER(vsadd_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src2;
+
+ lmul = vector_get_lmul(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ vl = env->vfp.vl;
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[j] = sat_add_s8(env,
+ env->vfp.vreg[src2].s8[j], sign_extend(rs1, 5));
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[j] = sat_add_s16(env,
+ env->vfp.vreg[src2].s16[j], sign_extend(rs1, 5));
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[j] = sat_add_s32(env,
+ env->vfp.vreg[src2].s32[j], sign_extend(rs1, 5));
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[j] = sat_add_s64(env,
+ env->vfp.vreg[src2].s64[j], sign_extend(rs1, 5));
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vssubu.vv vd, vs2, vs1, vm # Vector-vector */
+void VECTOR_HELPER(vssubu_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src1, src2;
+
+ lmul = vector_get_lmul(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ vl = env->vfp.vl;
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ src1 = rs1 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u8[j] = sat_sub_u8(env,
+ env->vfp.vreg[src2].u8[j], env->vfp.vreg[src1].u8[j]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[j] = sat_sub_u16(env,
+ env->vfp.vreg[src2].u16[j], env->vfp.vreg[src1].u16[j]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[j] = sat_sub_u32(env,
+ env->vfp.vreg[src2].u32[j], env->vfp.vreg[src1].u32[j]);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[j] = sat_sub_u64(env,
+ env->vfp.vreg[src2].u64[j], env->vfp.vreg[src1].u64[j]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vssubu.vx vd, vs2, rs1, vm # vector-scalar */
+void VECTOR_HELPER(vssubu_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src2;
+
+ lmul = vector_get_lmul(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ vl = env->vfp.vl;
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u8[j] = sat_sub_u8(env,
+ env->vfp.vreg[src2].u8[j], env->gpr[rs1]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[j] = sat_sub_u16(env,
+ env->vfp.vreg[src2].u16[j], env->gpr[rs1]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[j] = sat_sub_u32(env,
+ env->vfp.vreg[src2].u32[j], env->gpr[rs1]);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[j] = sat_sub_u64(env,
+ env->vfp.vreg[src2].u64[j], env->gpr[rs1]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vssub.vv vd, vs2, vs1, vm # Vector-vector */
+void VECTOR_HELPER(vssub_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src1, src2;
+
+ lmul = vector_get_lmul(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ vl = env->vfp.vl;
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ src1 = rs1 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[j] = sat_sub_s8(env,
+ env->vfp.vreg[src2].s8[j], env->vfp.vreg[src1].s8[j]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[j] = sat_sub_s16(env,
+ env->vfp.vreg[src2].s16[j], env->vfp.vreg[src1].s16[j]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[j] = sat_sub_s32(env,
+ env->vfp.vreg[src2].s32[j], env->vfp.vreg[src1].s32[j]);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[j] = sat_sub_s64(env,
+ env->vfp.vreg[src2].s64[j], env->vfp.vreg[src1].s64[j]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vssub.vx vd, vs2, rs1, vm # vector-scalar */
+void VECTOR_HELPER(vssub_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src2;
+
+ lmul = vector_get_lmul(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ vl = env->vfp.vl;
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[j] = sat_sub_s8(env,
+ env->vfp.vreg[src2].s8[j], env->gpr[rs1]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[j] = sat_sub_s16(env,
+ env->vfp.vreg[src2].s16[j], env->gpr[rs1]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[j] = sat_sub_s32(env,
+ env->vfp.vreg[src2].s32[j], env->gpr[rs1]);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[j] = sat_sub_s64(env,
+ env->vfp.vreg[src2].s64[j], env->gpr[rs1]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vaadd.vv vd, vs2, vs1, vm # Vector-vector */
+void VECTOR_HELPER(vaadd_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src1, src2;
+
+ lmul = vector_get_lmul(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ vl = env->vfp.vl;
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ src1 = rs1 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[j] = avg_round_s8(env,
+ env->vfp.vreg[src1].s8[j], env->vfp.vreg[src2].s8[j]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[j] = avg_round_s16(env,
+ env->vfp.vreg[src1].s16[j], env->vfp.vreg[src2].s16[j]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[j] = avg_round_s32(env,
+ env->vfp.vreg[src1].s32[j], env->vfp.vreg[src2].s32[j]);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[j] = avg_round_s64(env,
+ env->vfp.vreg[src1].s64[j], env->vfp.vreg[src2].s64[j]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vaadd.vx vd, vs2, rs1, vm # vector-scalar */
+void VECTOR_HELPER(vaadd_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src2;
+
+ lmul = vector_get_lmul(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ vl = env->vfp.vl;
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[j] = avg_round_s8(env,
+ env->gpr[rs1], env->vfp.vreg[src2].s8[j]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[j] = avg_round_s16(env,
+ env->gpr[rs1], env->vfp.vreg[src2].s16[j]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[j] = avg_round_s32(env,
+ env->gpr[rs1], env->vfp.vreg[src2].s32[j]);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[j] = avg_round_s64(env,
+ env->gpr[rs1], env->vfp.vreg[src2].s64[j]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vaadd.vi vd, vs2, imm, vm # vector-immediate */
+void VECTOR_HELPER(vaadd_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src2;
+
+ lmul = vector_get_lmul(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ vl = env->vfp.vl;
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[j] = avg_round_s8(env,
+ sign_extend(rs1, 5), env->vfp.vreg[src2].s8[j]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[j] = avg_round_s16(env,
+ sign_extend(rs1, 5), env->vfp.vreg[src2].s16[j]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[j] = avg_round_s32(env,
+ sign_extend(rs1, 5), env->vfp.vreg[src2].s32[j]);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[j] = avg_round_s64(env,
+ sign_extend(rs1, 5), env->vfp.vreg[src2].s64[j]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vasub.vv vd, vs2, vs1, vm # Vector-vector */
+void VECTOR_HELPER(vasub_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src1, src2;
+
+ lmul = vector_get_lmul(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ vl = env->vfp.vl;
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ src1 = rs1 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[j] = avg_round_s8(
+ env,
+ ~env->vfp.vreg[src1].s8[j] + 1,
+ env->vfp.vreg[src2].s8[j]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[j] = avg_round_s16(
+ env,
+ ~env->vfp.vreg[src1].s16[j] + 1,
+ env->vfp.vreg[src2].s16[j]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[j] = avg_round_s32(
+ env,
+ ~env->vfp.vreg[src1].s32[j] + 1,
+ env->vfp.vreg[src2].s32[j]);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[j] = avg_round_s64(
+ env,
+ ~env->vfp.vreg[src1].s64[j] + 1,
+ env->vfp.vreg[src2].s64[j]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ /* All elements are done: reset vstart before returning. */
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vasub.vx vd, vs2, rs1, vm # vector-scalar */
+void VECTOR_HELPER(vasub_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src2;
+
+ lmul = vector_get_lmul(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ vl = env->vfp.vl;
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[j] = avg_round_s8(
+ env, ~env->gpr[rs1] + 1, env->vfp.vreg[src2].s8[j]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[j] = avg_round_s16(
+ env, ~env->gpr[rs1] + 1, env->vfp.vreg[src2].s16[j]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[j] = avg_round_s32(
+ env, ~env->gpr[rs1] + 1, env->vfp.vreg[src2].s32[j]);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[j] = avg_round_s64(
+ env, ~env->gpr[rs1] + 1, env->vfp.vreg[src2].s64[j]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vsmul.vv vd, vs2, vs1, vm # vd[i] = clip((vs2[i]*vs1[i]+round)>>(SEW-1)) */
+void VECTOR_HELPER(vsmul_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src1, src2;
+ if (vector_vtype_ill(env)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ src1 = rs1 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[j] = vsmul_8(env,
+ env->vfp.vreg[src1].s8[j], env->vfp.vreg[src2].s8[j]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[j] = vsmul_16(env,
+ env->vfp.vreg[src1].s16[j], env->vfp.vreg[src2].s16[j]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[j] = vsmul_32(env,
+ env->vfp.vreg[src1].s32[j], env->vfp.vreg[src2].s32[j]);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[j] = vsmul_64(env,
+ env->vfp.vreg[src1].s64[j], env->vfp.vreg[src2].s64[j]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vsmul.vx vd, vs2, rs1, vm # vd[i] = clip((vs2[i]*x[rs1]+round)>>(SEW-1)) */
+void VECTOR_HELPER(vsmul_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src2;
+ if (vector_vtype_ill(env)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[j] = vsmul_8(env,
+ env->vfp.vreg[src2].s8[j], env->gpr[rs1]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[j] = vsmul_16(env,
+ env->vfp.vreg[src2].s16[j], env->gpr[rs1]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[j] = vsmul_32(env,
+ env->vfp.vreg[src2].s32[j], env->gpr[rs1]);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[j] = vsmul_64(env,
+ env->vfp.vreg[src2].s64[j], env->gpr[rs1]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/*
+ * vwsmaccu.vv vd, vs1, vs2, vm #
+ * vd[i] = clipu((+(vs1[i]*vs2[i]+round)>>SEW/2)+vd[i])
+ */
+void VECTOR_HELPER(vwsmaccu_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, k, dest, src1, src2;
+
+ lmul = vector_get_lmul(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ vl = env->vfp.vl;
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / (2 * width)));
+ src2 = rs2 + (i / (VLEN / width));
+ src1 = rs1 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[k] = vwsmaccu_8(env,
+ env->vfp.vreg[src2].u8[j],
+ env->vfp.vreg[src1].u8[j],
+ env->vfp.vreg[dest].u16[k]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[k] = vwsmaccu_16(env,
+ env->vfp.vreg[src2].u16[j],
+ env->vfp.vreg[src1].u16[j],
+ env->vfp.vreg[dest].u32[k]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[k] = vwsmaccu_32(env,
+ env->vfp.vreg[src2].u32[j],
+ env->vfp.vreg[src1].u32[j],
+ env->vfp.vreg[dest].u64[k]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_widen(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/*
+ * vwsmaccu.vx vd, rs1, vs2, vm #
+ * vd[i] = clipu((+(x[rs1]*vs2[i]+round)>>SEW/2)+vd[i])
+ */
+void VECTOR_HELPER(vwsmaccu_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, k, dest, src2;
+
+ lmul = vector_get_lmul(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ vl = env->vfp.vl;
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / (2 * width)));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[k] = vwsmaccu_8(env,
+ env->vfp.vreg[src2].u8[j],
+ env->gpr[rs1],
+ env->vfp.vreg[dest].u16[k]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[k] = vwsmaccu_16(env,
+ env->vfp.vreg[src2].u16[j],
+ env->gpr[rs1],
+ env->vfp.vreg[dest].u32[k]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[k] = vwsmaccu_32(env,
+ env->vfp.vreg[src2].u32[j],
+ env->gpr[rs1],
+ env->vfp.vreg[dest].u64[k]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_widen(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/*
+ * vwsmacc.vv vd, vs1, vs2, vm #
+ * vd[i] = clip((+(vs1[i]*vs2[i]+round)>>SEW/2)+vd[i])
+ */
+void VECTOR_HELPER(vwsmacc_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, k, dest, src1, src2;
+
+ lmul = vector_get_lmul(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ vl = env->vfp.vl;
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / (2 * width)));
+ src2 = rs2 + (i / (VLEN / width));
+ src1 = rs1 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[k] = vwsmacc_8(env,
+ env->vfp.vreg[src2].s8[j],
+ env->vfp.vreg[src1].s8[j],
+ env->vfp.vreg[dest].s16[k]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[k] = vwsmacc_16(env,
+ env->vfp.vreg[src2].s16[j],
+ env->vfp.vreg[src1].s16[j],
+ env->vfp.vreg[dest].s32[k]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[k] = vwsmacc_32(env,
+ env->vfp.vreg[src2].s32[j],
+ env->vfp.vreg[src1].s32[j],
+ env->vfp.vreg[dest].s64[k]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_widen(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/*
+ * vwsmacc.vx vd, rs1, vs2, vm #
+ * vd[i] = clip((+(x[rs1]*vs2[i]+round)>>SEW/2)+vd[i])
+ */
+void VECTOR_HELPER(vwsmacc_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, k, dest, src2;
+
+ lmul = vector_get_lmul(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ vl = env->vfp.vl;
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / (2 * width)));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[k] = vwsmacc_8(env,
+ env->vfp.vreg[src2].s8[j],
+ env->gpr[rs1],
+ env->vfp.vreg[dest].s16[k]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[k] = vwsmacc_16(env,
+ env->vfp.vreg[src2].s16[j],
+ env->gpr[rs1],
+ env->vfp.vreg[dest].s32[k]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[k] = vwsmacc_32(env,
+ env->vfp.vreg[src2].s32[j],
+ env->gpr[rs1],
+ env->vfp.vreg[dest].s64[k]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_widen(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/*
+ * vwsmaccsu.vv vd, vs1, vs2, vm
+ * # vd[i] = clip(-((signed(vs1[i])*unsigned(vs2[i])+round)>>SEW/2)+vd[i])
+ */
+void VECTOR_HELPER(vwsmaccsu_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, k, dest, src1, src2;
+
+ lmul = vector_get_lmul(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ vl = env->vfp.vl;
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / (2 * width)));
+ src2 = rs2 + (i / (VLEN / width));
+ src1 = rs1 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[k] = vwsmaccsu_8(env,
+ env->vfp.vreg[src2].u8[j],
+ env->vfp.vreg[src1].s8[j],
+ env->vfp.vreg[dest].s16[k]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[k] = vwsmaccsu_16(env,
+ env->vfp.vreg[src2].u16[j],
+ env->vfp.vreg[src1].s16[j],
+ env->vfp.vreg[dest].s32[k]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[k] = vwsmaccsu_32(env,
+ env->vfp.vreg[src2].u32[j],
+ env->vfp.vreg[src1].s32[j],
+ env->vfp.vreg[dest].s64[k]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_widen(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/*
+ * vwsmaccsu.vx vd, rs1, vs2, vm
+ * # vd[i] = clip(-((signed(x[rs1])*unsigned(vs2[i])+round)>>SEW/2)+vd[i])
+ */
+void VECTOR_HELPER(vwsmaccsu_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, k, dest, src2;
+
+ lmul = vector_get_lmul(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ vl = env->vfp.vl;
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / (2 * width)));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[k] = vwsmaccsu_8(env,
+ env->vfp.vreg[src2].u8[j],
+ env->gpr[rs1],
+ env->vfp.vreg[dest].s16[k]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[k] = vwsmaccsu_16(env,
+ env->vfp.vreg[src2].u16[j],
+ env->gpr[rs1],
+ env->vfp.vreg[dest].s32[k]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[k] = vwsmaccsu_32(env,
+ env->vfp.vreg[src2].u32[j],
+ env->gpr[rs1],
+ env->vfp.vreg[dest].s64[k]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_widen(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/*
+ * vwsmaccus.vx vd, rs1, vs2, vm
+ * # vd[i] = clip(-((unsigned(x[rs1])*signed(vs2[i])+round)>>SEW/2)+vd[i])
+ */
+void VECTOR_HELPER(vwsmaccus_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, k, dest, src2;
+
+ lmul = vector_get_lmul(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ vl = env->vfp.vl;
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / (2 * width)));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[k] = vwsmaccus_8(env,
+ env->vfp.vreg[src2].s8[j],
+ env->gpr[rs1],
+ env->vfp.vreg[dest].s16[k]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[k] = vwsmaccus_16(env,
+ env->vfp.vreg[src2].s16[j],
+ env->gpr[rs1],
+ env->vfp.vreg[dest].s32[k]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[k] = vwsmaccus_32(env,
+ env->vfp.vreg[src2].s32[j],
+ env->gpr[rs1],
+ env->vfp.vreg[dest].s64[k]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_widen(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vssrl.vv vd, vs2, vs1, vm # vd[i] = ((vs2[i] + round)>>vs1[i]) */
+void VECTOR_HELPER(vssrl_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src1, src2;
+
+ lmul = vector_get_lmul(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ vl = env->vfp.vl;
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ src1 = rs1 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u8[j] = vssrl_8(env,
+ env->vfp.vreg[src2].u8[j], env->vfp.vreg[src1].u8[j]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[j] = vssrl_16(env,
+ env->vfp.vreg[src2].u16[j], env->vfp.vreg[src1].u16[j]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[j] = vssrl_32(env,
+ env->vfp.vreg[src2].u32[j], env->vfp.vreg[src1].u32[j]);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[j] = vssrl_64(env,
+ env->vfp.vreg[src2].u64[j], env->vfp.vreg[src1].u64[j]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vssrl.vx vd, vs2, rs1, vm # vd[i] = ((vs2[i] + round)>>x[rs1]) */
+void VECTOR_HELPER(vssrl_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src2;
+
+ lmul = vector_get_lmul(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ vl = env->vfp.vl;
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u8[j] = vssrl_8(env,
+ env->vfp.vreg[src2].u8[j], env->gpr[rs1]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[j] = vssrl_16(env,
+ env->vfp.vreg[src2].u16[j], env->gpr[rs1]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[j] = vssrl_32(env,
+ env->vfp.vreg[src2].u32[j], env->gpr[rs1]);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[j] = vssrl_64(env,
+ env->vfp.vreg[src2].u64[j], env->gpr[rs1]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vssrl.vi vd, vs2, imm, vm # vd[i] = ((vs2[i] + round)>>imm) */
+void VECTOR_HELPER(vssrl_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src2;
+
+ lmul = vector_get_lmul(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ vl = env->vfp.vl;
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u8[j] = vssrli_8(env,
+ env->vfp.vreg[src2].u8[j], rs1);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[j] = vssrli_16(env,
+ env->vfp.vreg[src2].u16[j], rs1);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[j] = vssrli_32(env,
+ env->vfp.vreg[src2].u32[j], rs1);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[j] = vssrli_64(env,
+ env->vfp.vreg[src2].u64[j], rs1);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vssra.vv vd, vs2, vs1, vm # vd[i] = ((vs2[i] + round)>>vs1[i]) */
+void VECTOR_HELPER(vssra_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src1, src2;
+
+ lmul = vector_get_lmul(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ vl = env->vfp.vl;
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ src1 = rs1 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[j] = vssra_8(env,
+ env->vfp.vreg[src2].s8[j], env->vfp.vreg[src1].u8[j]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[j] = vssra_16(env,
+ env->vfp.vreg[src2].s16[j], env->vfp.vreg[src1].u16[j]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[j] = vssra_32(env,
+ env->vfp.vreg[src2].s32[j], env->vfp.vreg[src1].u32[j]);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[j] = vssra_64(env,
+ env->vfp.vreg[src2].s64[j], env->vfp.vreg[src1].u64[j]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vssra.vx vd, vs2, rs1, vm # vd[i] = ((vs2[i] + round)>>x[rs1]) */
+void VECTOR_HELPER(vssra_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src2;
+
+ lmul = vector_get_lmul(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ vl = env->vfp.vl;
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[j] = vssra_8(env,
+ env->vfp.vreg[src2].s8[j], env->gpr[rs1]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[j] = vssra_16(env,
+ env->vfp.vreg[src2].s16[j], env->gpr[rs1]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[j] = vssra_32(env,
+ env->vfp.vreg[src2].s32[j], env->gpr[rs1]);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[j] = vssra_64(env,
+ env->vfp.vreg[src2].s64[j], env->gpr[rs1]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vssra.vi vd, vs2, imm, vm # vd[i] = ((vs2[i] + round)>>imm) */
+void VECTOR_HELPER(vssra_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src2;
+
+ lmul = vector_get_lmul(env);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ vl = env->vfp.vl;
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[j] = vssrai_8(env,
+ env->vfp.vreg[src2].s8[j], rs1);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[j] = vssrai_16(env,
+ env->vfp.vreg[src2].s16[j], rs1);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[j] = vssrai_32(env,
+ env->vfp.vreg[src2].s32[j], rs1);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[j] = vssrai_64(env,
+ env->vfp.vreg[src2].s64[j], rs1);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vnclipu.vv vd, vs2, vs1, vm # vector-vector */
+void VECTOR_HELPER(vnclipu_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, k, src1, src2;
+
+ lmul = vector_get_lmul(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_dstgp_srcgp(rd, lmul, rs2, 2 * lmul)
+ || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, true);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ vl = env->vfp.vl;
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / (2 * width)));
+ src1 = rs1 + (i / (VLEN / width));
+ j = i % (VLEN / (2 * width));
+ k = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u8[k] = vnclipu_16(env,
+ env->vfp.vreg[src2].u16[j], env->vfp.vreg[src1].u8[k]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[k] = vnclipu_32(env,
+ env->vfp.vreg[src2].u32[j], env->vfp.vreg[src1].u16[k]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[k] = vnclipu_64(env,
+ env->vfp.vreg[src2].u64[j], env->vfp.vreg[src1].u32[k]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_narrow(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vnclipu.vx vd, vs2, rs1, vm # vector-scalar */
+void VECTOR_HELPER(vnclipu_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, k, dest, src2;
+
+ lmul = vector_get_lmul(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_dstgp_srcgp(rd, lmul, rs2, 2 * lmul)
+ || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, true);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ vl = env->vfp.vl;
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / (2 * width)));
+ j = i % (VLEN / (2 * width));
+ k = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u8[k] = vnclipu_16(env,
+ env->vfp.vreg[src2].u16[j], env->gpr[rs1]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[k] = vnclipu_32(env,
+ env->vfp.vreg[src2].u32[j], env->gpr[rs1]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[k] = vnclipu_64(env,
+ env->vfp.vreg[src2].u64[j], env->gpr[rs1]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_narrow(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vnclipu.vi vd, vs2, imm, vm # vector-immediate */
+void VECTOR_HELPER(vnclipu_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, k, dest, src2;
+
+ lmul = vector_get_lmul(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_dstgp_srcgp(rd, lmul, rs2, 2 * lmul)
+ || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, true);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ vl = env->vfp.vl;
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / (2 * width)));
+ j = i % (VLEN / (2 * width));
+ k = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u8[k] = vnclipui_16(env,
+ env->vfp.vreg[src2].u16[j], rs1);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[k] = vnclipui_32(env,
+ env->vfp.vreg[src2].u32[j], rs1);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[k] = vnclipui_64(env,
+ env->vfp.vreg[src2].u64[j], rs1);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_narrow(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vnclip.vv vd, vs2, vs1, vm # vector-vector */
+void VECTOR_HELPER(vnclip_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, k, src1, src2;
+
+ lmul = vector_get_lmul(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_dstgp_srcgp(rd, lmul, rs2, 2 * lmul)
+ || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, true);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ vl = env->vfp.vl;
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / (2 * width)));
+ src1 = rs1 + (i / (VLEN / width));
+ j = i % (VLEN / (2 * width));
+ k = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[k] = vnclip_16(env,
+ env->vfp.vreg[src2].s16[j], env->vfp.vreg[src1].u8[k]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[k] = vnclip_32(env,
+ env->vfp.vreg[src2].s32[j], env->vfp.vreg[src1].u16[k]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[k] = vnclip_64(env,
+ env->vfp.vreg[src2].s64[j], env->vfp.vreg[src1].u32[k]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_narrow(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vnclip.vx vd, vs2, rs1, vm # vector-scalar */
+void VECTOR_HELPER(vnclip_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, k, src2;
+
+ lmul = vector_get_lmul(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_dstgp_srcgp(rd, lmul, rs2, 2 * lmul)
+ || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, true);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ vl = env->vfp.vl;
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / (2 * width)));
+ j = i % (VLEN / (2 * width));
+ k = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[k] = vnclip_16(env,
+ env->vfp.vreg[src2].s16[j], env->gpr[rs1]);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[k] = vnclip_32(env,
+ env->vfp.vreg[src2].s32[j], env->gpr[rs1]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[k] = vnclip_64(env,
+ env->vfp.vreg[src2].s64[j], env->gpr[rs1]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_narrow(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vnclip.vi vd, vs2, imm, vm # vector-immediate */
+void VECTOR_HELPER(vnclip_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, k, src2;
+
+ lmul = vector_get_lmul(env);
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_dstgp_srcgp(rd, lmul, rs2, 2 * lmul)
+ || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, true);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ vl = env->vfp.vl;
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / (2 * width)));
+ j = i % (VLEN / (2 * width));
+ k = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s8[k] = vnclipi_16(env,
+ env->vfp.vreg[src2].s16[j], rs1);
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[k] = vnclipi_32(env,
+ env->vfp.vreg[src2].s32[j], rs1);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[k] = vnclipi_64(env,
+ env->vfp.vreg[src2].s64[j], rs1);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_narrow(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
--
2.7.4
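
Note on the element indexing used by the widening (vwsmacc*) and narrowing
(vnclip*) helpers in the patch above: each helper derives a register number
and a slot inside that register from the element index i, using VLEN / width
for the single-width operand and VLEN / (2 * width) for the double-width one.
The following is only a minimal standalone sketch of that arithmetic, assuming
VLEN is 128 as defined in patch 01/17. The names elem_pos, locate_elem,
locate_single and locate_double are hypothetical and do not appear in the
series.

```c
#include <assert.h>

/* Assumption: VLEN is 128 bits, matching the #define in patch 01/17. */
#define VLEN 128

/* Hypothetical helper type for illustration; not part of the patch. */
struct elem_pos {
    int reg;   /* vector register number that holds element i */
    int slot;  /* index of element i inside that register */
};

/*
 * Locate element i of a register group that starts at register "base"
 * and stores elements that are elem_bits wide. This mirrors the
 * "base + i / (VLEN / bits)" and "i % (VLEN / bits)" pattern in the helpers.
 */
static struct elem_pos locate_elem(int base, int elem_bits, int i)
{
    struct elem_pos pos;
    int per_reg;

    assert(elem_bits > 0 && elem_bits <= VLEN && VLEN % elem_bits == 0);
    per_reg = VLEN / elem_bits;   /* elements that fit in one register */
    pos.reg = base + i / per_reg;
    pos.slot = i % per_reg;
    return pos;
}

/* Single-width operand: source of a widening op, dest of a narrowing op. */
static struct elem_pos locate_single(int base, int sew, int i)
{
    return locate_elem(base, sew, i);
}

/* Double-width operand: dest of a widening op, vs2 of a narrowing op. */
static struct elem_pos locate_double(int base, int sew, int i)
{
    return locate_elem(base, 2 * sew, i);
}
```

For example, with SEW = 8, element 20 of a source group starting at v4 lands
in v5, slot 4 (16 elements per register), while the widened destination
element for a group starting at v8 lands in v10, slot 4 (8 elements per
register). This is why the widening helpers carry two slot indices, j and k.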
* [Qemu-devel] [PATCH v2 13/17] RISC-V: add vector extension float instruction part1, add/sub/mul/div
2019-09-11 6:25 [Qemu-devel] [PATCH v2 00/17] RISC-V: support vector extension liuzhiwei
` (11 preceding siblings ...)
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 12/17] RISC-V: add vector extension fixed point instructions liuzhiwei
@ 2019-09-11 6:25 ` liuzhiwei
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 14/17] RISC-V: add vector extension float instructions part2, sqrt/cmp/cvt/others liuzhiwei
` (4 subsequent siblings)
17 siblings, 0 replies; 43+ messages in thread
From: liuzhiwei @ 2019-09-11 6:25 UTC (permalink / raw)
To: Alistair.Francis, palmer, sagark, kbastian, riku.voipio, laurent,
wenmeng_zhang
Cc: qemu-riscv, qemu-devel, wxy194768, LIU Zhiwei
From: LIU Zhiwei <zhiwei_liu@c-sky.com>
Signed-off-by: LIU Zhiwei <zhiwei_liu@c-sky.com>
---
target/riscv/helper.h | 37 +
target/riscv/insn32.decode | 37 +
target/riscv/insn_trans/trans_rvv.inc.c | 37 +
target/riscv/vector_helper.c | 2645 +++++++++++++++++++++++++++++++
4 files changed, 2756 insertions(+)
diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index ff6002e..d2c8684 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -307,5 +307,42 @@ DEF_HELPER_5(vector_vnclip_vv, void, env, i32, i32, i32, i32)
DEF_HELPER_5(vector_vnclip_vx, void, env, i32, i32, i32, i32)
DEF_HELPER_5(vector_vnclip_vi, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfadd_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfadd_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfsub_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfsub_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfrsub_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfwadd_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfwadd_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfwadd_wv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfwadd_wf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfwsub_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfwsub_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfwsub_wv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfwsub_wf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfmul_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfmul_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfdiv_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfdiv_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfrdiv_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfwmul_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfwmul_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfmacc_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfmacc_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfnmacc_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfnmacc_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfmsac_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfmsac_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfnmsac_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfnmsac_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfmadd_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfmadd_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfnmadd_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfnmadd_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfmsub_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfmsub_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfnmsub_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfnmsub_vf, void, env, i32, i32, i32, i32)
+
DEF_HELPER_4(vector_vsetvli, void, env, i32, i32, i32)
DEF_HELPER_4(vector_vsetvl, void, env, i32, i32, i32)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index a82e53e..31868ab 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -447,5 +447,42 @@ vnclip_vv 101111 . ..... ..... 000 ..... 1010111 @r_vm
vnclip_vx 101111 . ..... ..... 100 ..... 1010111 @r_vm
vnclip_vi 101111 . ..... ..... 011 ..... 1010111 @r_vm
+vfadd_vv 000000 . ..... ..... 001 ..... 1010111 @r_vm
+vfadd_vf 000000 . ..... ..... 101 ..... 1010111 @r_vm
+vfsub_vv 000010 . ..... ..... 001 ..... 1010111 @r_vm
+vfsub_vf 000010 . ..... ..... 101 ..... 1010111 @r_vm
+vfrsub_vf 100111 . ..... ..... 101 ..... 1010111 @r_vm
+vfwadd_vv 110000 . ..... ..... 001 ..... 1010111 @r_vm
+vfwadd_vf 110000 . ..... ..... 101 ..... 1010111 @r_vm
+vfwadd_wv 110100 . ..... ..... 001 ..... 1010111 @r_vm
+vfwadd_wf 110100 . ..... ..... 101 ..... 1010111 @r_vm
+vfwsub_vv 110010 . ..... ..... 001 ..... 1010111 @r_vm
+vfwsub_vf 110010 . ..... ..... 101 ..... 1010111 @r_vm
+vfwsub_wv 110110 . ..... ..... 001 ..... 1010111 @r_vm
+vfwsub_wf 110110 . ..... ..... 101 ..... 1010111 @r_vm
+vfmul_vv 100100 . ..... ..... 001 ..... 1010111 @r_vm
+vfmul_vf 100100 . ..... ..... 101 ..... 1010111 @r_vm
+vfdiv_vv 100000 . ..... ..... 001 ..... 1010111 @r_vm
+vfdiv_vf 100000 . ..... ..... 101 ..... 1010111 @r_vm
+vfrdiv_vf 100001 . ..... ..... 101 ..... 1010111 @r_vm
+vfwmul_vv 111000 . ..... ..... 001 ..... 1010111 @r_vm
+vfwmul_vf 111000 . ..... ..... 101 ..... 1010111 @r_vm
+vfmacc_vf 101100 . ..... ..... 101 ..... 1010111 @r_vm
+vfmacc_vv 101100 . ..... ..... 001 ..... 1010111 @r_vm
+vfnmacc_vv 101101 . ..... ..... 001 ..... 1010111 @r_vm
+vfnmacc_vf 101101 . ..... ..... 101 ..... 1010111 @r_vm
+vfmsac_vv 101110 . ..... ..... 001 ..... 1010111 @r_vm
+vfmsac_vf 101110 . ..... ..... 101 ..... 1010111 @r_vm
+vfnmsac_vv 101111 . ..... ..... 001 ..... 1010111 @r_vm
+vfnmsac_vf 101111 . ..... ..... 101 ..... 1010111 @r_vm
+vfmadd_vv 101000 . ..... ..... 001 ..... 1010111 @r_vm
+vfmadd_vf 101000 . ..... ..... 101 ..... 1010111 @r_vm
+vfnmadd_vv 101001 . ..... ..... 001 ..... 1010111 @r_vm
+vfnmadd_vf 101001 . ..... ..... 101 ..... 1010111 @r_vm
+vfmsub_vv 101010 . ..... ..... 001 ..... 1010111 @r_vm
+vfmsub_vf 101010 . ..... ..... 101 ..... 1010111 @r_vm
+vfnmsub_vv 101011 . ..... ..... 001 ..... 1010111 @r_vm
+vfnmsub_vf 101011 . ..... ..... 101 ..... 1010111 @r_vm
+
vsetvli 0 ........... ..... 111 ..... 1010111 @r2_zimm
vsetvl 1000000 ..... ..... 111 ..... 1010111 @r
diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index d650e8c..ff23bc2 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -336,5 +336,42 @@ GEN_VECTOR_R_VM(vnclip_vv)
GEN_VECTOR_R_VM(vnclip_vx)
GEN_VECTOR_R_VM(vnclip_vi)
+GEN_VECTOR_R_VM(vfadd_vv)
+GEN_VECTOR_R_VM(vfadd_vf)
+GEN_VECTOR_R_VM(vfsub_vv)
+GEN_VECTOR_R_VM(vfsub_vf)
+GEN_VECTOR_R_VM(vfrsub_vf)
+GEN_VECTOR_R_VM(vfwadd_vv)
+GEN_VECTOR_R_VM(vfwadd_vf)
+GEN_VECTOR_R_VM(vfwadd_wv)
+GEN_VECTOR_R_VM(vfwadd_wf)
+GEN_VECTOR_R_VM(vfwsub_wv)
+GEN_VECTOR_R_VM(vfwsub_wf)
+GEN_VECTOR_R_VM(vfwsub_vv)
+GEN_VECTOR_R_VM(vfwsub_vf)
+GEN_VECTOR_R_VM(vfmul_vv)
+GEN_VECTOR_R_VM(vfmul_vf)
+GEN_VECTOR_R_VM(vfdiv_vv)
+GEN_VECTOR_R_VM(vfdiv_vf)
+GEN_VECTOR_R_VM(vfrdiv_vf)
+GEN_VECTOR_R_VM(vfwmul_vv)
+GEN_VECTOR_R_VM(vfwmul_vf)
+GEN_VECTOR_R_VM(vfmacc_vv)
+GEN_VECTOR_R_VM(vfmacc_vf)
+GEN_VECTOR_R_VM(vfnmacc_vv)
+GEN_VECTOR_R_VM(vfnmacc_vf)
+GEN_VECTOR_R_VM(vfmsac_vv)
+GEN_VECTOR_R_VM(vfmsac_vf)
+GEN_VECTOR_R_VM(vfnmsac_vv)
+GEN_VECTOR_R_VM(vfnmsac_vf)
+GEN_VECTOR_R_VM(vfmadd_vv)
+GEN_VECTOR_R_VM(vfmadd_vf)
+GEN_VECTOR_R_VM(vfnmadd_vv)
+GEN_VECTOR_R_VM(vfnmadd_vf)
+GEN_VECTOR_R_VM(vfmsub_vv)
+GEN_VECTOR_R_VM(vfmsub_vf)
+GEN_VECTOR_R_VM(vfnmsub_vv)
+GEN_VECTOR_R_VM(vfnmsub_vf)
+
GEN_VECTOR_R2_ZIMM(vsetvli)
GEN_VECTOR_R(vsetvl)
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 2292fa5..e16543b 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -21,6 +21,7 @@
#include "exec/exec-all.h"
#include "exec/helper-proto.h"
#include "exec/cpu_ldst.h"
+#include "fpu/softfloat.h"
#include <math.h>
#define VECTOR_HELPER(name) HELPER(glue(vector_, name))
@@ -1125,6 +1126,41 @@ static void vector_tail_narrow(CPURISCVState *env, int vreg, int index,
}
}
+static void vector_tail_fcommon(CPURISCVState *env, int vreg, int index,
+ int width)
+{
+ switch (width) {
+ case 16:
+ env->vfp.vreg[vreg].u16[index] = 0;
+ break;
+ case 32:
+ env->vfp.vreg[vreg].u32[index] = 0;
+ break;
+ case 64:
+ env->vfp.vreg[vreg].u64[index] = 0;
+ break;
+ default:
+ helper_raise_exception(env, RISCV_EXCP_ILLEGAL_INST);
+ return;
+ }
+}
+
+static void vector_tail_fwiden(CPURISCVState *env, int vreg, int index,
+ int width)
+{
+ switch (width) {
+ case 16:
+ env->vfp.vreg[vreg].u32[index] = 0;
+ break;
+ case 32:
+ env->vfp.vreg[vreg].u64[index] = 0;
+ break;
+ default:
+ helper_raise_exception(env, RISCV_EXCP_ILLEGAL_INST);
+ return;
+ }
+}
+
static inline int vector_get_carry(CPURISCVState *env, int width, int lmul,
int index)
{
@@ -17114,3 +17150,2612 @@ void VECTOR_HELPER(vnclip_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
env->vfp.vstart = 0;
return;
}
+
+/* vfadd.vv vd, vs2, vs1, vm # Vector-vector vd[i] = vs2[i] + vs1[i] */
+void VECTOR_HELPER(vfadd_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src1, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ src1 = rs1 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[j] = float16_add(
+ env->vfp.vreg[src1].f16[j],
+ env->vfp.vreg[src2].f16[j],
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[j] = float32_add(
+ env->vfp.vreg[src1].f32[j],
+ env->vfp.vreg[src2].f32[j],
+ &env->fp_status);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[j] = float64_add(
+ env->vfp.vreg[src1].f64[j],
+ env->vfp.vreg[src2].f64[j],
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fcommon(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfadd.vf vd, vs2, rs1, vm # Vector-scalar vd[i] = vs2[i] + f[rs1] */
+void VECTOR_HELPER(vfadd_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src2;
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[j] = float16_add(
+ env->fpr[rs1],
+ env->vfp.vreg[src2].f16[j],
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[j] = float32_add(
+ env->fpr[rs1],
+ env->vfp.vreg[src2].f32[j],
+ &env->fp_status);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[j] = float64_add(
+ env->fpr[rs1],
+ env->vfp.vreg[src2].f64[j],
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fcommon(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfsub.vv vd, vs2, vs1, vm # Vector-vector vd[i] = vs2[i] - vs1[i] */
+void VECTOR_HELPER(vfsub_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src1, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ src1 = rs1 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[j] = float16_sub(
+ env->vfp.vreg[src2].f16[j],
+ env->vfp.vreg[src1].f16[j],
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[j] = float32_sub(
+ env->vfp.vreg[src2].f32[j],
+ env->vfp.vreg[src1].f32[j],
+ &env->fp_status);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[j] = float64_sub(
+ env->vfp.vreg[src2].f64[j],
+ env->vfp.vreg[src1].f64[j],
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fcommon(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfsub.vf vd, vs2, rs1, vm # Vector-scalar vd[i] = vs2[i] - f[rs1] */
+void VECTOR_HELPER(vfsub_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[j] = float16_sub(
+ env->vfp.vreg[src2].f16[j],
+ env->fpr[rs1],
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[j] = float32_sub(
+ env->vfp.vreg[src2].f32[j],
+ env->fpr[rs1],
+ &env->fp_status);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[j] = float64_sub(
+ env->vfp.vreg[src2].f64[j],
+ env->fpr[rs1],
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fcommon(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfrsub.vf vd, vs2, rs1, vm # Scalar-vector vd[i] = f[rs1] - vs2[i] */
+void VECTOR_HELPER(vfrsub_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[j] = float16_sub(
+ env->fpr[rs1],
+ env->vfp.vreg[src2].f16[j],
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[j] = float32_sub(
+ env->fpr[rs1],
+ env->vfp.vreg[src2].f32[j],
+ &env->fp_status);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[j] = float64_sub(
+ env->fpr[rs1],
+ env->vfp.vreg[src2].f64[j],
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fcommon(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfwadd.vv vd, vs2, vs1, vm # Vector-vector, 2*SEW = SEW + SEW */
+void VECTOR_HELPER(vfwadd_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, k, dest, src1, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / (2 * width)));
+ src2 = rs2 + (i / (VLEN / width));
+ src1 = rs1 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[k] = float32_add(
+ float16_to_float32(env->vfp.vreg[src2].f16[j], true,
+ &env->fp_status),
+ float16_to_float32(env->vfp.vreg[src1].f16[j], true,
+ &env->fp_status),
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[k] = float64_add(
+ float32_to_float64(env->vfp.vreg[src2].f32[j],
+ &env->fp_status),
+ float32_to_float64(env->vfp.vreg[src1].f32[j],
+ &env->fp_status),
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fwiden(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfwadd.vf vd, vs2, rs1, vm # Vector-scalar, 2*SEW = SEW + SEW */
+void VECTOR_HELPER(vfwadd_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, k, dest, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ if (vector_vtype_ill(env)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / (2 * width)));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[k] = float32_add(
+ float16_to_float32(env->vfp.vreg[src2].f16[j], true,
+ &env->fp_status),
+ float16_to_float32(env->fpr[rs1], true,
+ &env->fp_status),
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[k] = float64_add(
+ float32_to_float64(env->vfp.vreg[src2].f32[j],
+ &env->fp_status),
+ float32_to_float64(env->fpr[rs1], &env->fp_status),
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fwiden(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfwadd.wv vd, vs2, vs1, vm # Vector-vector, 2*SEW = 2*SEW + SEW */
+void VECTOR_HELPER(vfwadd_wv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, k, dest, src1, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, true);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / (2 * width)));
+ src2 = rs2 + (i / (VLEN / (2 * width)));
+ src1 = rs1 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[k] = float32_add(
+ env->vfp.vreg[src2].f32[k],
+ float16_to_float32(env->vfp.vreg[src1].f16[j], true,
+ &env->fp_status),
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[k] = float64_add(
+ env->vfp.vreg[src2].f64[k],
+ float32_to_float64(env->vfp.vreg[src1].f32[j],
+ &env->fp_status),
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fwiden(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfwadd.wf vd, vs2, rs1, vm # Vector-scalar, 2*SEW = 2*SEW + SEW */
+void VECTOR_HELPER(vfwadd_wf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, k, dest, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, true);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / (2 * width)));
+ src2 = rs2 + (i / (VLEN / (2 * width)));
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[k] = float32_add(
+ env->vfp.vreg[src2].f32[k],
+ float16_to_float32(env->fpr[rs1], true,
+ &env->fp_status),
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[k] = float64_add(
+ env->vfp.vreg[src2].f64[k],
+ float32_to_float64(env->fpr[rs1], &env->fp_status),
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fwiden(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfwsub.vv vd, vs2, vs1, vm # Vector-vector, 2*SEW = SEW - SEW */
+void VECTOR_HELPER(vfwsub_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, k, dest, src1, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / (2 * width)));
+ src2 = rs2 + (i / (VLEN / width));
+ src1 = rs1 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[k] = float32_sub(
+ float16_to_float32(env->vfp.vreg[src2].f16[j], true,
+ &env->fp_status),
+ float16_to_float32(env->vfp.vreg[src1].f16[j], true,
+ &env->fp_status),
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[k] = float64_sub(
+ float32_to_float64(env->vfp.vreg[src2].f32[j],
+ &env->fp_status),
+ float32_to_float64(env->vfp.vreg[src1].f32[j],
+ &env->fp_status),
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fwiden(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfwsub.vf vd, vs2, rs1, vm # Vector-scalar, 2*SEW = SEW - SEW */
+void VECTOR_HELPER(vfwsub_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, k, dest, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / (2 * width)));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[k] = float32_sub(
+ float16_to_float32(env->vfp.vreg[src2].f16[j], true,
+ &env->fp_status),
+ float16_to_float32(env->fpr[rs1], true,
+ &env->fp_status),
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[k] = float64_sub(
+ float32_to_float64(env->vfp.vreg[src2].f32[j],
+ &env->fp_status),
+ float32_to_float64(env->fpr[rs1], &env->fp_status),
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fwiden(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfwsub.wv vd, vs2, vs1, vm # Vector-vector, 2*SEW = 2*SEW - SEW */
+void VECTOR_HELPER(vfwsub_wv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, k, dest, src1, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, true);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / (2 * width)));
+ src2 = rs2 + (i / (VLEN / (2 * width)));
+ src1 = rs1 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[k] = float32_sub(
+ env->vfp.vreg[src2].f32[k],
+ float16_to_float32(env->vfp.vreg[src1].f16[j], true,
+ &env->fp_status),
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[k] = float64_sub(
+ env->vfp.vreg[src2].f64[k],
+ float32_to_float64(env->vfp.vreg[src1].f32[j],
+ &env->fp_status),
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fwiden(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfwsub.wf vd, vs2, rs1, vm # Vector-scalar, 2*SEW = 2*SEW - SEW */
+void VECTOR_HELPER(vfwsub_wf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, k, dest, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, true);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / (2 * width)));
+ src2 = rs2 + (i / (VLEN / (2 * width)));
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[k] = float32_sub(
+ env->vfp.vreg[src2].f32[k],
+ float16_to_float32(env->fpr[rs1], true,
+ &env->fp_status),
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[k] = float64_sub(
+ env->vfp.vreg[src2].f64[k],
+ float32_to_float64(env->fpr[rs1], &env->fp_status),
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fwiden(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfmul.vv vd, vs2, vs1, vm # Vector-vector vd[i] = vs2[i] * vs1[i] */
+void VECTOR_HELPER(vfmul_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src1, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ src1 = rs1 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[j] = float16_mul(
+ env->vfp.vreg[src1].f16[j],
+ env->vfp.vreg[src2].f16[j],
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[j] = float32_mul(
+ env->vfp.vreg[src1].f32[j],
+ env->vfp.vreg[src2].f32[j],
+ &env->fp_status);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[j] = float64_mul(
+ env->vfp.vreg[src1].f64[j],
+ env->vfp.vreg[src2].f64[j],
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fcommon(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfmul.vf vd, vs2, rs1, vm # Vector-scalar vd[i] = vs2[i] * f[rs1] */
+void VECTOR_HELPER(vfmul_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[j] = float16_mul(
+ env->fpr[rs1],
+ env->vfp.vreg[src2].f16[j],
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[j] = float32_mul(
+ env->fpr[rs1],
+ env->vfp.vreg[src2].f32[j],
+ &env->fp_status);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[j] = float64_mul(
+ env->fpr[rs1],
+ env->vfp.vreg[src2].f64[j],
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fcommon(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfdiv.vv vd, vs2, vs1, vm # Vector-vector vd[i] = vs2[i] / vs1[i] */
+void VECTOR_HELPER(vfdiv_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src1, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ src1 = rs1 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[j] = float16_div(
+ env->vfp.vreg[src2].f16[j],
+ env->vfp.vreg[src1].f16[j],
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[j] = float32_div(
+ env->vfp.vreg[src2].f32[j],
+ env->vfp.vreg[src1].f32[j],
+ &env->fp_status);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[j] = float64_div(
+ env->vfp.vreg[src2].f64[j],
+ env->vfp.vreg[src1].f64[j],
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fcommon(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfdiv.vf vd, vs2, rs1, vm # vector-scalar */
+void VECTOR_HELPER(vfdiv_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[j] = float16_div(
+ env->vfp.vreg[src2].f16[j],
+ env->fpr[rs1],
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[j] = float32_div(
+ env->vfp.vreg[src2].f32[j],
+ env->fpr[rs1],
+ &env->fp_status);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[j] = float64_div(
+ env->vfp.vreg[src2].f64[j],
+ env->fpr[rs1],
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fcommon(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfrdiv.vf vd, vs2, rs1, vm # scalar-vector, vd[i] = f[rs1]/vs2[i] */
+void VECTOR_HELPER(vfrdiv_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[j] = float16_div(
+ env->fpr[rs1],
+ env->vfp.vreg[src2].f16[j],
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[j] = float32_div(
+ env->fpr[rs1],
+ env->vfp.vreg[src2].f32[j],
+ &env->fp_status);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[j] = float64_div(
+ env->fpr[rs1],
+ env->vfp.vreg[src2].f64[j],
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fcommon(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfwmul.vv vd, vs2, vs1, vm # vector-vector */
+void VECTOR_HELPER(vfwmul_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, k, dest, src1, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs1, lmul)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / (2 * width)));
+ src2 = rs2 + (i / (VLEN / width));
+ src1 = rs1 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[k] = float32_mul(
+ float16_to_float32(env->vfp.vreg[src2].f16[j], true,
+ &env->fp_status),
+ float16_to_float32(env->vfp.vreg[src1].f16[j], true,
+ &env->fp_status),
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[k] = float64_mul(
+ float32_to_float64(env->vfp.vreg[src2].f32[j],
+ &env->fp_status),
+ float32_to_float64(env->vfp.vreg[src1].f32[j],
+ &env->fp_status),
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fwiden(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfwmul.vf vd, vs2, rs1, vm # vector-scalar */
+void VECTOR_HELPER(vfwmul_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, k, dest, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / (2 * width)));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[k] = float32_mul(
+ float16_to_float32(env->vfp.vreg[src2].f16[j], true,
+ &env->fp_status),
+ float16_to_float32(env->fpr[rs1], true,
+ &env->fp_status),
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[k] = float64_mul(
+ float32_to_float64(env->vfp.vreg[src2].f32[j],
+ &env->fp_status),
+ float32_to_float64(env->fpr[rs1], &env->fp_status),
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fwiden(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfmacc.vv vd, vs1, vs2, vm # vd[i] = +(vs1[i] * vs2[i]) + vd[i] */
+void VECTOR_HELPER(vfmacc_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src1, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ src1 = rs1 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[j] = float16_muladd(
+ env->vfp.vreg[src1].f16[j],
+ env->vfp.vreg[src2].f16[j],
+ env->vfp.vreg[dest].f16[j],
+ 0,
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[j] = float32_muladd(
+ env->vfp.vreg[src1].f32[j],
+ env->vfp.vreg[src2].f32[j],
+ env->vfp.vreg[dest].f32[j],
+ 0,
+ &env->fp_status);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[j] = float64_muladd(
+ env->vfp.vreg[src1].f64[j],
+ env->vfp.vreg[src2].f64[j],
+ env->vfp.vreg[dest].f64[j],
+ 0,
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fcommon(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfmacc.vf vd, rs1, vs2, vm # vd[i] = +(f[rs1] * vs2[i]) + vd[i] */
+void VECTOR_HELPER(vfmacc_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[j] = float16_muladd(
+ env->fpr[rs1],
+ env->vfp.vreg[src2].f16[j],
+ env->vfp.vreg[dest].f16[j],
+ 0,
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[j] = float32_muladd(
+ env->fpr[rs1],
+ env->vfp.vreg[src2].f32[j],
+ env->vfp.vreg[dest].f32[j],
+ 0,
+ &env->fp_status);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[j] = float64_muladd(
+ env->fpr[rs1],
+ env->vfp.vreg[src2].f64[j],
+ env->vfp.vreg[dest].f64[j],
+ 0,
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fcommon(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfnmacc.vv vd, vs1, vs2, vm # vd[i] = -(vs1[i] * vs2[i]) - vd[i] */
+void VECTOR_HELPER(vfnmacc_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src1, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ src1 = rs1 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[j] = float16_muladd(
+ env->vfp.vreg[src1].f16[j],
+ env->vfp.vreg[src2].f16[j],
+ env->vfp.vreg[dest].f16[j],
+ float_muladd_negate_c |
+ float_muladd_negate_product,
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[j] = float32_muladd(
+ env->vfp.vreg[src1].f32[j],
+ env->vfp.vreg[src2].f32[j],
+ env->vfp.vreg[dest].f32[j],
+ float_muladd_negate_c |
+ float_muladd_negate_product,
+ &env->fp_status);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[j] = float64_muladd(
+ env->vfp.vreg[src1].f64[j],
+ env->vfp.vreg[src2].f64[j],
+ env->vfp.vreg[dest].f64[j],
+ float_muladd_negate_c |
+ float_muladd_negate_product,
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fcommon(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfnmacc.vf vd, rs1, vs2, vm # vd[i] = -(f[rs1] * vs2[i]) - vd[i] */
+void VECTOR_HELPER(vfnmacc_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[j] = float16_muladd(
+ env->fpr[rs1],
+ env->vfp.vreg[src2].f16[j],
+ env->vfp.vreg[dest].f16[j],
+ float_muladd_negate_c |
+ float_muladd_negate_product,
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[j] = float32_muladd(
+ env->fpr[rs1],
+ env->vfp.vreg[src2].f32[j],
+ env->vfp.vreg[dest].f32[j],
+ float_muladd_negate_c |
+ float_muladd_negate_product,
+ &env->fp_status);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[j] = float64_muladd(
+ env->fpr[rs1],
+ env->vfp.vreg[src2].f64[j],
+ env->vfp.vreg[dest].f64[j],
+ float_muladd_negate_c |
+ float_muladd_negate_product,
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fcommon(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfmsac.vv vd, vs1, vs2, vm # vd[i] = +(vs1[i] * vs2[i]) - vd[i] */
+void VECTOR_HELPER(vfmsac_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src1, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ src1 = rs1 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[j] = float16_muladd(
+ env->vfp.vreg[src1].f16[j],
+ env->vfp.vreg[src2].f16[j],
+ env->vfp.vreg[dest].f16[j],
+ float_muladd_negate_c,
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[j] = float32_muladd(
+ env->vfp.vreg[src1].f32[j],
+ env->vfp.vreg[src2].f32[j],
+ env->vfp.vreg[dest].f32[j],
+ float_muladd_negate_c,
+ &env->fp_status);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[j] = float64_muladd(
+ env->vfp.vreg[src1].f64[j],
+ env->vfp.vreg[src2].f64[j],
+ env->vfp.vreg[dest].f64[j],
+ float_muladd_negate_c,
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fcommon(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfmsac.vf vd, rs1, vs2, vm # vd[i] = +(f[rs1] * vs2[i]) - vd[i] */
+void VECTOR_HELPER(vfmsac_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[j] = float16_muladd(
+ env->fpr[rs1],
+ env->vfp.vreg[src2].f16[j],
+ env->vfp.vreg[dest].f16[j],
+ float_muladd_negate_c,
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[j] = float32_muladd(
+ env->fpr[rs1],
+ env->vfp.vreg[src2].f32[j],
+ env->vfp.vreg[dest].f32[j],
+ float_muladd_negate_c,
+ &env->fp_status);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[j] = float64_muladd(
+ env->fpr[rs1],
+ env->vfp.vreg[src2].f64[j],
+ env->vfp.vreg[dest].f64[j],
+ float_muladd_negate_c,
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fcommon(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfnmsac.vv vd, vs1, vs2, vm # vd[i] = -(vs1[i] * vs2[i]) + vd[i] */
+void VECTOR_HELPER(vfnmsac_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src1, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ src1 = rs1 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[j] = float16_muladd(
+ env->vfp.vreg[src1].f16[j],
+ env->vfp.vreg[src2].f16[j],
+ env->vfp.vreg[dest].f16[j],
+ float_muladd_negate_product,
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[j] = float32_muladd(
+ env->vfp.vreg[src1].f32[j],
+ env->vfp.vreg[src2].f32[j],
+ env->vfp.vreg[dest].f32[j],
+ float_muladd_negate_product,
+ &env->fp_status);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[j] = float64_muladd(
+ env->vfp.vreg[src1].f64[j],
+ env->vfp.vreg[src2].f64[j],
+ env->vfp.vreg[dest].f64[j],
+ float_muladd_negate_product,
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fcommon(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfnmsac.vf vd, rs1, vs2, vm # vd[i] = -(f[rs1] * vs2[i]) + vd[i] */
+void VECTOR_HELPER(vfnmsac_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[j] = float16_muladd(
+ env->fpr[rs1],
+ env->vfp.vreg[src2].f16[j],
+ env->vfp.vreg[dest].f16[j],
+ float_muladd_negate_product,
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[j] = float32_muladd(
+ env->fpr[rs1],
+ env->vfp.vreg[src2].f32[j],
+ env->vfp.vreg[dest].f32[j],
+ float_muladd_negate_product,
+ &env->fp_status);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[j] = float64_muladd(
+ env->fpr[rs1],
+ env->vfp.vreg[src2].f64[j],
+ env->vfp.vreg[dest].f64[j],
+ float_muladd_negate_product,
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fcommon(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfmadd.vv vd, vs1, vs2, vm # vd[i] = +(vs1[i] * vd[i]) + vs2[i] */
+void VECTOR_HELPER(vfmadd_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src1, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ src1 = rs1 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[j] = float16_muladd(
+ env->vfp.vreg[src1].f16[j],
+ env->vfp.vreg[dest].f16[j],
+ env->vfp.vreg[src2].f16[j],
+ 0,
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[j] = float32_muladd(
+ env->vfp.vreg[src1].f32[j],
+ env->vfp.vreg[dest].f32[j],
+ env->vfp.vreg[src2].f32[j],
+ 0,
+ &env->fp_status);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[j] = float64_muladd(
+ env->vfp.vreg[src1].f64[j],
+ env->vfp.vreg[dest].f64[j],
+ env->vfp.vreg[src2].f64[j],
+ 0,
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fcommon(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfmadd.vf vd, rs1, vs2, vm # vd[i] = +(f[rs1] * vd[i]) + vs2[i] */
+void VECTOR_HELPER(vfmadd_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[j] = float16_muladd(
+ env->fpr[rs1],
+ env->vfp.vreg[dest].f16[j],
+ env->vfp.vreg[src2].f16[j],
+ 0,
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[j] = float32_muladd(
+ env->fpr[rs1],
+ env->vfp.vreg[dest].f32[j],
+ env->vfp.vreg[src2].f32[j],
+ 0,
+ &env->fp_status);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[j] = float64_muladd(
+ env->fpr[rs1],
+ env->vfp.vreg[dest].f64[j],
+ env->vfp.vreg[src2].f64[j],
+ 0,
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fcommon(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfnmadd.vv vd, vs1, vs2, vm # vd[i] = -(vs1[i] * vd[i]) - vs2[i] */
+void VECTOR_HELPER(vfnmadd_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src1, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ src1 = rs1 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[j] = float16_muladd(
+ env->vfp.vreg[src1].f16[j],
+ env->vfp.vreg[dest].f16[j],
+ env->vfp.vreg[src2].f16[j],
+ float_muladd_negate_c |
+ float_muladd_negate_product,
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[j] = float32_muladd(
+ env->vfp.vreg[src1].f32[j],
+ env->vfp.vreg[dest].f32[j],
+ env->vfp.vreg[src2].f32[j],
+ float_muladd_negate_c |
+ float_muladd_negate_product,
+ &env->fp_status);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[j] = float64_muladd(
+ env->vfp.vreg[src1].f64[j],
+ env->vfp.vreg[dest].f64[j],
+ env->vfp.vreg[src2].f64[j],
+ float_muladd_negate_c |
+ float_muladd_negate_product,
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fcommon(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfnmadd.vf vd, rs1, vs2, vm # vd[i] = -(f[rs1] * vd[i]) - vs2[i] */
+void VECTOR_HELPER(vfnmadd_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[j] = float16_muladd(
+ env->fpr[rs1],
+ env->vfp.vreg[dest].f16[j],
+ env->vfp.vreg[src2].f16[j],
+ float_muladd_negate_c |
+ float_muladd_negate_product,
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[j] = float32_muladd(
+ env->fpr[rs1],
+ env->vfp.vreg[dest].f32[j],
+ env->vfp.vreg[src2].f32[j],
+ float_muladd_negate_c |
+ float_muladd_negate_product,
+ &env->fp_status);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[j] = float64_muladd(
+ env->fpr[rs1],
+ env->vfp.vreg[dest].f64[j],
+ env->vfp.vreg[src2].f64[j],
+ float_muladd_negate_c |
+ float_muladd_negate_product,
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fcommon(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfmsub.vv vd, vs1, vs2, vm # vd[i] = +(vs1[i] * vd[i]) - vs2[i] */
+void VECTOR_HELPER(vfmsub_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src1, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ src1 = rs1 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[j] = float16_muladd(
+ env->vfp.vreg[src1].f16[j],
+ env->vfp.vreg[dest].f16[j],
+ env->vfp.vreg[src2].f16[j],
+ float_muladd_negate_c,
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[j] = float32_muladd(
+ env->vfp.vreg[src1].f32[j],
+ env->vfp.vreg[dest].f32[j],
+ env->vfp.vreg[src2].f32[j],
+ float_muladd_negate_c,
+ &env->fp_status);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[j] = float64_muladd(
+ env->vfp.vreg[src1].f64[j],
+ env->vfp.vreg[dest].f64[j],
+ env->vfp.vreg[src2].f64[j],
+ float_muladd_negate_c,
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fcommon(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfmsub.vf vd, rs1, vs2, vm # vd[i] = +(f[rs1] * vd[i]) - vs2[i] */
+void VECTOR_HELPER(vfmsub_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[j] = float16_muladd(
+ env->fpr[rs1],
+ env->vfp.vreg[dest].f16[j],
+ env->vfp.vreg[src2].f16[j],
+ float_muladd_negate_c,
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[j] = float32_muladd(
+ env->fpr[rs1],
+ env->vfp.vreg[dest].f32[j],
+ env->vfp.vreg[src2].f32[j],
+ float_muladd_negate_c,
+ &env->fp_status);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[j] = float64_muladd(
+ env->fpr[rs1],
+ env->vfp.vreg[dest].f64[j],
+ env->vfp.vreg[src2].f64[j],
+ float_muladd_negate_c,
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fcommon(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+/* vfnmsub.vv vd, vs1, vs2, vm # vd[i] = -(vs1[i] * vd[i]) + vs2[i] */
+void VECTOR_HELPER(vfnmsub_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src1, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ src1 = rs1 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[j] = float16_muladd(
+ env->vfp.vreg[src1].f16[j],
+ env->vfp.vreg[dest].f16[j],
+ env->vfp.vreg[src2].f16[j],
+ float_muladd_negate_product,
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[j] = float32_muladd(
+ env->vfp.vreg[src1].f32[j],
+ env->vfp.vreg[dest].f32[j],
+ env->vfp.vreg[src2].f32[j],
+ float_muladd_negate_product,
+ &env->fp_status);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[j] = float64_muladd(
+ env->vfp.vreg[src1].f64[j],
+ env->vfp.vreg[dest].f64[j],
+ env->vfp.vreg[src2].f64[j],
+ float_muladd_negate_product,
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fcommon(env, dest, j, width);
+ }
+ }
+
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfnmsub.vf vd, rs1, vs2, vm # vd[i] = -(f[rs1] * vd[i]) + vs2[i] */
+void VECTOR_HELPER(vfnmsub_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[j] = float16_muladd(
+ env->fpr[rs1],
+ env->vfp.vreg[dest].f16[j],
+ env->vfp.vreg[src2].f16[j],
+ float_muladd_negate_product,
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[j] = float32_muladd(
+ env->fpr[rs1],
+ env->vfp.vreg[dest].f32[j],
+ env->vfp.vreg[src2].f32[j],
+ float_muladd_negate_product,
+ &env->fp_status);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[j] = float64_muladd(
+ env->fpr[rs1],
+ env->vfp.vreg[dest].f64[j],
+ env->vfp.vreg[src2].f64[j],
+ float_muladd_negate_product,
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fcommon(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+
--
2.7.4
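
Note on the fused multiply-subtract helpers above: the per-element arithmetic
stated in the vfmsub/vfnmsub comments can be sketched in plain C as below.
This is only an illustration under stated assumptions. The sketch_* names are
hypothetical, and ordinary double arithmetic stands in for softfloat, so the
single rounding of a true fused operation is not modelled.

```c
#include <assert.h>

/*
 * Hypothetical scalar stand-ins for one active element; not QEMU code.
 * The real helpers call floatN_muladd(), which rounds once.
 */

/* vfmsub: vd[i] = +(vs1[i] * vd[i]) - vs2[i] */
static double sketch_fmsub(double vs1, double vd, double vs2)
{
    /* negated addend, cf. float_muladd_negate_c in the patch */
    return (vs1 * vd) + (-vs2);
}

/* vfnmsub: vd[i] = -(vs1[i] * vd[i]) + vs2[i] */
static double sketch_fnmsub(double vs1, double vd, double vs2)
{
    /* negated product, cf. float_muladd_negate_product in the patch */
    return -(vs1 * vd) + vs2;
}
```

With vs1 = 2, vd = 3 and vs2 = 1, the first form yields 5 and the second -5.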
* [Qemu-devel] [PATCH v2 14/17] RISC-V: add vector extension float instructions part2, sqrt/cmp/cvt/others
From: liuzhiwei @ 2019-09-11 6:25 UTC (permalink / raw)
To: Alistair.Francis, palmer, sagark, kbastian, riku.voipio, laurent,
wenmeng_zhang
Cc: qemu-riscv, qemu-devel, wxy194768, LIU Zhiwei
From: LIU Zhiwei <zhiwei_liu@c-sky.com>
Signed-off-by: LIU Zhiwei <zhiwei_liu@c-sky.com>
---
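Note on sign injection: the vfsgnj/vfsgnjn helpers in this patch build their
result by depositing the low (width - 1) bits of vs2 into vs1, so the sign
comes from vs1 and the magnitude from vs2. A minimal standalone sketch for the
16-bit case follows. It assumes binary16 values are held as raw uint16_t bit
patterns; the sketch_* names are hypothetical and not part of the series.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Sketch of a 16-bit deposit. The mask is derived from 0xFFFF so that it
 * never extends beyond bit 15 before being narrowed to uint16_t.
 */
static inline uint16_t sketch_deposit16(uint16_t value, int start, int length,
                                        uint16_t fieldval)
{
    uint16_t mask;

    assert(start >= 0 && length > 0 && length <= 16 - start);
    mask = (uint16_t)((0xFFFFu >> (16 - length)) << start);
    return (uint16_t)((value & ~mask) | ((fieldval << start) & mask));
}

/* vfsgnj: sign bit from vs1, exponent and fraction from vs2 */
static inline uint16_t sketch_fsgnj16(uint16_t vs1, uint16_t vs2)
{
    return sketch_deposit16(vs1, 0, 15, vs2);
}

/* vfsgnjn: as above, but with the sign bit of vs1 inverted */
static inline uint16_t sketch_fsgnjn16(uint16_t vs1, uint16_t vs2)
{
    return sketch_deposit16((uint16_t)~vs1, 0, 15, vs2);
}
```

For example, with vs1 = 0x8000 (-0.0) and vs2 = 0x3C00 (1.0), the first
function gives 0xBC00 (-1.0) and the second gives 0x3C00 (1.0).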
target/riscv/helper.h | 40 +
target/riscv/insn32.decode | 40 +
target/riscv/insn_trans/trans_rvv.inc.c | 54 +
target/riscv/vector_helper.c | 2962 +++++++++++++++++++++++++++++++
4 files changed, 3096 insertions(+)
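Note on element addressing: every helper in vector_helper.c maps a logical
element index i to a register number and a slot inside that register with
"base + i / (VLEN / width)" and "i % (VLEN / width)". The sketch below isolates
that mapping. It assumes VLEN is 128, as defined in cpu.h by this series;
struct sketch_pos and sketch_locate() are hypothetical names used only here.

```c
#include <assert.h>

#define SKETCH_VLEN 128 /* assumption: same value as VLEN in cpu.h */

struct sketch_pos {
    int reg; /* vector register that holds element i */
    int idx; /* slot of element i inside that register */
};

/* Locate element i of a register group starting at 'base'. */
static struct sketch_pos sketch_locate(int base, int width, int i)
{
    int per_reg = SKETCH_VLEN / width; /* elements per vector register */
    struct sketch_pos p = { base + i / per_reg, i % per_reg };

    return p;
}
```

With width = 32 a register holds four elements, so element 5 of a group based
at v8 lands in v9, slot 1.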
diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index d2c8684..e2384eb 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -344,5 +344,45 @@ DEF_HELPER_5(vector_vfmsub_vf, void, env, i32, i32, i32, i32)
DEF_HELPER_5(vector_vfnmsub_vv, void, env, i32, i32, i32, i32)
DEF_HELPER_5(vector_vfnmsub_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_4(vector_vfsqrt_v, void, env, i32, i32, i32)
+DEF_HELPER_5(vector_vfmin_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfmin_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfmax_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfmax_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfsgnj_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfsgnj_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfsgnjn_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfsgnjn_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfsgnjx_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfsgnjx_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmfeq_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmfeq_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmfne_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmfne_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmfle_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmfle_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmflt_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmflt_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmfgt_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmfge_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmford_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vmford_vf, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfmerge_vfm, void, env, i32, i32, i32, i32)
+DEF_HELPER_4(vector_vfclass_v, void, env, i32, i32, i32)
+DEF_HELPER_4(vector_vfcvt_xu_f_v, void, env, i32, i32, i32)
+DEF_HELPER_4(vector_vfcvt_x_f_v, void, env, i32, i32, i32)
+DEF_HELPER_4(vector_vfcvt_f_xu_v, void, env, i32, i32, i32)
+DEF_HELPER_4(vector_vfcvt_f_x_v, void, env, i32, i32, i32)
+DEF_HELPER_4(vector_vfwcvt_xu_f_v, void, env, i32, i32, i32)
+DEF_HELPER_4(vector_vfwcvt_x_f_v, void, env, i32, i32, i32)
+DEF_HELPER_4(vector_vfwcvt_f_xu_v, void, env, i32, i32, i32)
+DEF_HELPER_4(vector_vfwcvt_f_x_v, void, env, i32, i32, i32)
+DEF_HELPER_4(vector_vfwcvt_f_f_v, void, env, i32, i32, i32)
+DEF_HELPER_4(vector_vfncvt_xu_f_v, void, env, i32, i32, i32)
+DEF_HELPER_4(vector_vfncvt_x_f_v, void, env, i32, i32, i32)
+DEF_HELPER_4(vector_vfncvt_f_xu_v, void, env, i32, i32, i32)
+DEF_HELPER_4(vector_vfncvt_f_x_v, void, env, i32, i32, i32)
+DEF_HELPER_4(vector_vfncvt_f_f_v, void, env, i32, i32, i32)
+
DEF_HELPER_4(vector_vsetvli, void, env, i32, i32, i32)
DEF_HELPER_4(vector_vsetvl, void, env, i32, i32, i32)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 31868ab..256d8ea 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -67,6 +67,7 @@
@r_wdvm ..... wd:1 vm:1 ..... ..... ... ..... ....... %rs2 %rs1 %rd
@r_nfvm nf:3 ... vm:1 ..... ..... ... ..... ....... %rs2 %rs1 %rd
@r2_nfvm nf:3 ... vm:1 ..... ..... ... ..... ....... %rs1 %rd
+@r2_vm ...... vm:1 ..... ..... ... ..... ....... %rs2 %rd
@r2_zimm . zimm:11 ..... ... ..... ....... %rs1 %rd
@sfence_vma ....... ..... ..... ... ..... ....... %rs2 %rs1
@@ -483,6 +484,45 @@ vfmsub_vv 101010 . ..... ..... 001 ..... 1010111 @r_vm
vfmsub_vf 101010 . ..... ..... 101 ..... 1010111 @r_vm
vfnmsub_vv 101011 . ..... ..... 001 ..... 1010111 @r_vm
vfnmsub_vf 101011 . ..... ..... 101 ..... 1010111 @r_vm
+vfsqrt_v 100011 . ..... 00000 001 ..... 1010111 @r2_vm
+vfmin_vv 000100 . ..... ..... 001 ..... 1010111 @r_vm
+vfmin_vf 000100 . ..... ..... 101 ..... 1010111 @r_vm
+vfmax_vv 000110 . ..... ..... 001 ..... 1010111 @r_vm
+vfmax_vf 000110 . ..... ..... 101 ..... 1010111 @r_vm
+vfsgnj_vv 001000 . ..... ..... 001 ..... 1010111 @r_vm
+vfsgnj_vf 001000 . ..... ..... 101 ..... 1010111 @r_vm
+vfsgnjn_vv 001001 . ..... ..... 001 ..... 1010111 @r_vm
+vfsgnjn_vf 001001 . ..... ..... 101 ..... 1010111 @r_vm
+vfsgnjx_vv 001010 . ..... ..... 001 ..... 1010111 @r_vm
+vfsgnjx_vf 001010 . ..... ..... 101 ..... 1010111 @r_vm
+vmfeq_vv 011000 . ..... ..... 001 ..... 1010111 @r_vm
+vmfeq_vf 011000 . ..... ..... 101 ..... 1010111 @r_vm
+vmfne_vv 011100 . ..... ..... 001 ..... 1010111 @r_vm
+vmfne_vf 011100 . ..... ..... 101 ..... 1010111 @r_vm
+vmflt_vv 011011 . ..... ..... 001 ..... 1010111 @r_vm
+vmflt_vf 011011 . ..... ..... 101 ..... 1010111 @r_vm
+vmfle_vv 011001 . ..... ..... 001 ..... 1010111 @r_vm
+vmfle_vf 011001 . ..... ..... 101 ..... 1010111 @r_vm
+vmfgt_vf 011101 . ..... ..... 101 ..... 1010111 @r_vm
+vmfge_vf 011111 . ..... ..... 101 ..... 1010111 @r_vm
+vmford_vv 011010 . ..... ..... 001 ..... 1010111 @r_vm
+vmford_vf 011010 . ..... ..... 101 ..... 1010111 @r_vm
+vfclass_v 100011 . ..... 10000 001 ..... 1010111 @r2_vm
+vfmerge_vfm 010111 . ..... ..... 101 ..... 1010111 @r_vm
+vfcvt_xu_f_v 100010 . ..... 00000 001 ..... 1010111 @r2_vm
+vfcvt_x_f_v 100010 . ..... 00001 001 ..... 1010111 @r2_vm
+vfcvt_f_xu_v 100010 . ..... 00010 001 ..... 1010111 @r2_vm
+vfcvt_f_x_v 100010 . ..... 00011 001 ..... 1010111 @r2_vm
+vfwcvt_xu_f_v 100010 . ..... 01000 001 ..... 1010111 @r2_vm
+vfwcvt_x_f_v 100010 . ..... 01001 001 ..... 1010111 @r2_vm
+vfwcvt_f_xu_v 100010 . ..... 01010 001 ..... 1010111 @r2_vm
+vfwcvt_f_x_v 100010 . ..... 01011 001 ..... 1010111 @r2_vm
+vfwcvt_f_f_v 100010 . ..... 01100 001 ..... 1010111 @r2_vm
+vfncvt_xu_f_v 100010 . ..... 10000 001 ..... 1010111 @r2_vm
+vfncvt_x_f_v 100010 . ..... 10001 001 ..... 1010111 @r2_vm
+vfncvt_f_xu_v 100010 . ..... 10010 001 ..... 1010111 @r2_vm
+vfncvt_f_x_v 100010 . ..... 10011 001 ..... 1010111 @r2_vm
+vfncvt_f_f_v 100010 . ..... 10100 001 ..... 1010111 @r2_vm
vsetvli 0 ........... ..... 111 ..... 1010111 @r2_zimm
vsetvl 1000000 ..... ..... 111 ..... 1010111 @r
diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index ff23bc2..e4d4576 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -92,6 +92,20 @@ static bool trans_##INSN(DisasContext *ctx, arg_##INSN * a) \
return true; \
}
+#define GEN_VECTOR_R2_VM(INSN) \
+static bool trans_##INSN(DisasContext *ctx, arg_##INSN * a) \
+{ \
+ TCGv_i32 s2 = tcg_const_i32(a->rs2); \
+ TCGv_i32 d = tcg_const_i32(a->rd); \
+ TCGv_i32 vm = tcg_const_i32(a->vm); \
+ gen_helper_vector_##INSN(cpu_env, vm, s2, d); \
+ tcg_temp_free_i32(s2); \
+ tcg_temp_free_i32(d); \
+ tcg_temp_free_i32(vm); \
+ return true; \
+}
+
+
#define GEN_VECTOR_R2_ZIMM(INSN) \
static bool trans_##INSN(DisasContext *ctx, arg_##INSN * a) \
{ \
@@ -373,5 +387,45 @@ GEN_VECTOR_R_VM(vfmsub_vf)
GEN_VECTOR_R_VM(vfnmsub_vv)
GEN_VECTOR_R_VM(vfnmsub_vf)
+GEN_VECTOR_R2_VM(vfsqrt_v)
+GEN_VECTOR_R_VM(vfmin_vv)
+GEN_VECTOR_R_VM(vfmin_vf)
+GEN_VECTOR_R_VM(vfmax_vv)
+GEN_VECTOR_R_VM(vfmax_vf)
+GEN_VECTOR_R_VM(vfsgnj_vv)
+GEN_VECTOR_R_VM(vfsgnj_vf)
+GEN_VECTOR_R_VM(vfsgnjn_vv)
+GEN_VECTOR_R_VM(vfsgnjn_vf)
+GEN_VECTOR_R_VM(vfsgnjx_vv)
+GEN_VECTOR_R_VM(vfsgnjx_vf)
+GEN_VECTOR_R_VM(vmfeq_vv)
+GEN_VECTOR_R_VM(vmfeq_vf)
+GEN_VECTOR_R_VM(vmfne_vv)
+GEN_VECTOR_R_VM(vmfne_vf)
+GEN_VECTOR_R_VM(vmfle_vv)
+GEN_VECTOR_R_VM(vmfle_vf)
+GEN_VECTOR_R_VM(vmflt_vv)
+GEN_VECTOR_R_VM(vmflt_vf)
+GEN_VECTOR_R_VM(vmfgt_vf)
+GEN_VECTOR_R_VM(vmfge_vf)
+GEN_VECTOR_R_VM(vmford_vv)
+GEN_VECTOR_R_VM(vmford_vf)
+GEN_VECTOR_R2_VM(vfclass_v)
+GEN_VECTOR_R_VM(vfmerge_vfm)
+GEN_VECTOR_R2_VM(vfcvt_xu_f_v)
+GEN_VECTOR_R2_VM(vfcvt_x_f_v)
+GEN_VECTOR_R2_VM(vfcvt_f_xu_v)
+GEN_VECTOR_R2_VM(vfcvt_f_x_v)
+GEN_VECTOR_R2_VM(vfwcvt_xu_f_v)
+GEN_VECTOR_R2_VM(vfwcvt_x_f_v)
+GEN_VECTOR_R2_VM(vfwcvt_f_xu_v)
+GEN_VECTOR_R2_VM(vfwcvt_f_x_v)
+GEN_VECTOR_R2_VM(vfwcvt_f_f_v)
+GEN_VECTOR_R2_VM(vfncvt_xu_f_v)
+GEN_VECTOR_R2_VM(vfncvt_x_f_v)
+GEN_VECTOR_R2_VM(vfncvt_f_xu_v)
+GEN_VECTOR_R2_VM(vfncvt_f_x_v)
+GEN_VECTOR_R2_VM(vfncvt_f_f_v)
+
GEN_VECTOR_R2_ZIMM(vsetvli)
GEN_VECTOR_R(vsetvl)
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index e16543b..fd2ecb7 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -914,6 +914,25 @@ static inline int64_t avg_round_s64(CPURISCVState *env, int64_t a, int64_t b)
return res;
}
+static target_ulong helper_fclass_h(uint64_t frs1)
+{
+ float16 f = frs1;
+ bool sign = float16_is_neg(f);
+
+ if (float16_is_infinity(f)) {
+ return sign ? 1 << 0 : 1 << 7;
+ } else if (float16_is_zero(f)) {
+ return sign ? 1 << 3 : 1 << 4;
+ } else if (float16_is_zero_or_denormal(f)) {
+ return sign ? 1 << 2 : 1 << 5;
+ } else if (float16_is_any_nan(f)) {
+ float_status s = { }; /* for snan_bit_is_one */
+ return float16_is_quiet_nan(f, &s) ? 1 << 9 : 1 << 8;
+ } else {
+ return sign ? 1 << 1 : 1 << 6;
+ }
+}
+
static inline bool vector_vtype_ill(CPURISCVState *env)
{
if ((env->vfp.vtype >> (sizeof(target_ulong) - 1)) & 0x1) {
@@ -1017,6 +1036,32 @@ static bool vector_lmul_check_reg(CPURISCVState *env, uint32_t lmul,
return true;
}
+/**
+ * deposit16:
+ * @value: initial value to insert bit field into
+ * @start: the lowest bit in the bit field (numbered from 0)
+ * @length: the length of the bit field
+ * @fieldval: the value to insert into the bit field
+ *
+ * Deposit @fieldval into the 16 bit @value at the bit field specified
+ * by the @start and @length parameters, and return the modified
+ * @value. Bits of @value outside the bit field are not modified.
+ * Bits of @fieldval above the least significant @length bits are
+ * ignored. The bit field must lie entirely within the 16 bit word.
+ * It is valid to request that all 16 bits are modified (ie @length
+ * 16 and @start 0).
+ *
+ * Returns: the modified @value.
+ */
+static inline uint16_t deposit16(uint16_t value, int start, int length,
+ uint16_t fieldval)
+{
+ uint16_t mask;
+ assert(start >= 0 && length > 0 && length <= 16 - start);
+ mask = (0xffffU >> (16 - length)) << start;
+ return (value & ~mask) | ((fieldval << start) & mask);
+}
+
static void vector_tail_amo(CPURISCVState *env, int vreg, int index, int width)
{
switch (width) {
@@ -1161,6 +1206,22 @@ static void vector_tail_fwiden(CPURISCVState *env, int vreg, int index,
}
}
+static void vector_tail_fnarrow(CPURISCVState *env, int vreg, int index,
+ int width)
+{
+ switch (width) {
+ case 16:
+ env->vfp.vreg[vreg].u16[index] = 0;
+ break;
+ case 32:
+ env->vfp.vreg[vreg].u32[index] = 0;
+ break;
+ default:
+ helper_raise_exception(env, RISCV_EXCP_ILLEGAL_INST);
+ return;
+ }
+}
+
static inline int vector_get_carry(CPURISCVState *env, int width, int lmul,
int index)
{
@@ -19758,4 +19819,2905 @@ void VECTOR_HELPER(vfnmsub_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
return;
}
+/* vfsqrt.v vd, vs2, vm # Vector-vector square root */
+void VECTOR_HELPER(vfsqrt_v)(CPURISCVState *env, uint32_t vm, uint32_t rs2,
+ uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[j] = float16_sqrt(
+ env->vfp.vreg[src2].f16[j],
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[j] = float32_sqrt(
+ env->vfp.vreg[src2].f32[j],
+ &env->fp_status);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[j] = float64_sqrt(
+ env->vfp.vreg[src2].f64[j],
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ /* Zero the tail element (vl <= i < vlmax). */
+ if (width == 16) {
+ env->vfp.vreg[dest].f16[j] = 0;
+ } else if (width == 32) {
+ env->vfp.vreg[dest].f32[j] = 0;
+ } else if (width == 64) {
+ env->vfp.vreg[dest].f64[j] = 0;
+ } else {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfmin.vv vd, vs2, vs1, vm # Vector-vector */
+void VECTOR_HELPER(vfmin_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src1, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ src1 = rs1 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[j] = float16_minnum(
+ env->vfp.vreg[src1].f16[j],
+ env->vfp.vreg[src2].f16[j],
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[j] = float32_minnum(
+ env->vfp.vreg[src1].f32[j],
+ env->vfp.vreg[src2].f32[j],
+ &env->fp_status);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[j] = float64_minnum(
+ env->vfp.vreg[src1].f64[j],
+ env->vfp.vreg[src2].f64[j],
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ /* Zero the tail element (vl <= i < vlmax). */
+ if (width == 16) {
+ env->vfp.vreg[dest].f16[j] = 0;
+ } else if (width == 32) {
+ env->vfp.vreg[dest].f32[j] = 0;
+ } else if (width == 64) {
+ env->vfp.vreg[dest].f64[j] = 0;
+ } else {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfmin.vf vd, vs2, rs1, vm # vector-scalar */
+void VECTOR_HELPER(vfmin_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[j] = float16_minnum(
+ env->fpr[rs1],
+ env->vfp.vreg[src2].f16[j],
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[j] = float32_minnum(
+ env->fpr[rs1],
+ env->vfp.vreg[src2].f32[j],
+ &env->fp_status);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[j] = float64_minnum(
+ env->fpr[rs1],
+ env->vfp.vreg[src2].f64[j],
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ /* Zero the tail element (vl <= i < vlmax). */
+ if (width == 16) {
+ env->vfp.vreg[dest].f16[j] = 0;
+ } else if (width == 32) {
+ env->vfp.vreg[dest].f32[j] = 0;
+ } else if (width == 64) {
+ env->vfp.vreg[dest].f64[j] = 0;
+ } else {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfmax.vv vd, vs2, vs1, vm # Vector-vector */
+void VECTOR_HELPER(vfmax_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src1, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ src1 = rs1 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[j] = float16_maxnum(
+ env->vfp.vreg[src1].f16[j],
+ env->vfp.vreg[src2].f16[j],
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[j] = float32_maxnum(
+ env->vfp.vreg[src1].f32[j],
+ env->vfp.vreg[src2].f32[j],
+ &env->fp_status);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[j] = float64_maxnum(
+ env->vfp.vreg[src1].f64[j],
+ env->vfp.vreg[src2].f64[j],
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ /* Zero the tail element (vl <= i < vlmax). */
+ if (width == 16) {
+ env->vfp.vreg[dest].f16[j] = 0;
+ } else if (width == 32) {
+ env->vfp.vreg[dest].f32[j] = 0;
+ } else if (width == 64) {
+ env->vfp.vreg[dest].f64[j] = 0;
+ } else {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfmax.vf vd, vs2, rs1, vm # vector-scalar */
+void VECTOR_HELPER(vfmax_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[j] = float16_maxnum(
+ env->fpr[rs1],
+ env->vfp.vreg[src2].f16[j],
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[j] = float32_maxnum(
+ env->fpr[rs1],
+ env->vfp.vreg[src2].f32[j],
+ &env->fp_status);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[j] = float64_maxnum(
+ env->fpr[rs1],
+ env->vfp.vreg[src2].f64[j],
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ /* Zero the tail element (vl <= i < vlmax). */
+ if (width == 16) {
+ env->vfp.vreg[dest].f16[j] = 0;
+ } else if (width == 32) {
+ env->vfp.vreg[dest].f32[j] = 0;
+ } else if (width == 64) {
+ env->vfp.vreg[dest].f64[j] = 0;
+ } else {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfsgnj.vv vd, vs2, vs1, vm # Vector-vector */
+void VECTOR_HELPER(vfsgnj_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src1, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ src1 = rs1 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[j] = deposit16(
+ env->vfp.vreg[src1].f16[j],
+ 0,
+ 15,
+ env->vfp.vreg[src2].f16[j]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[j] = deposit32(
+ env->vfp.vreg[src1].f32[j],
+ 0,
+ 31,
+ env->vfp.vreg[src2].f32[j]);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[j] = deposit64(
+ env->vfp.vreg[src1].f64[j],
+ 0,
+ 63,
+ env->vfp.vreg[src2].f64[j]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ /* Zero the tail element (vl <= i < vlmax). */
+ if (width == 16) {
+ env->vfp.vreg[dest].f16[j] = 0;
+ } else if (width == 32) {
+ env->vfp.vreg[dest].f32[j] = 0;
+ } else if (width == 64) {
+ env->vfp.vreg[dest].f64[j] = 0;
+ } else {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfsgnj.vf vd, vs2, rs1, vm # vector-scalar */
+void VECTOR_HELPER(vfsgnj_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[j] = deposit16(
+ env->fpr[rs1],
+ 0,
+ 15,
+ env->vfp.vreg[src2].f16[j]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[j] = deposit32(
+ env->fpr[rs1],
+ 0,
+ 31,
+ env->vfp.vreg[src2].f32[j]);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[j] = deposit64(
+ env->fpr[rs1],
+ 0,
+ 63,
+ env->vfp.vreg[src2].f64[j]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ /* Zero the tail element (vl <= i < vlmax). */
+ if (width == 16) {
+ env->vfp.vreg[dest].f16[j] = 0;
+ } else if (width == 32) {
+ env->vfp.vreg[dest].f32[j] = 0;
+ } else if (width == 64) {
+ env->vfp.vreg[dest].f64[j] = 0;
+ } else {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfsgnjn.vv vd, vs2, vs1, vm # Vector-vector */
+void VECTOR_HELPER(vfsgnjn_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src1, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ src1 = rs1 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[j] = deposit16(
+ ~env->vfp.vreg[src1].f16[j],
+ 0,
+ 15,
+ env->vfp.vreg[src2].f16[j]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[j] = deposit32(
+ ~env->vfp.vreg[src1].f32[j],
+ 0,
+ 31,
+ env->vfp.vreg[src2].f32[j]);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[j] = deposit64(
+ ~env->vfp.vreg[src1].f64[j],
+ 0,
+ 63,
+ env->vfp.vreg[src2].f64[j]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ /* Zero the tail element (vl <= i < vlmax). */
+ if (width == 16) {
+ env->vfp.vreg[dest].f16[j] = 0;
+ } else if (width == 32) {
+ env->vfp.vreg[dest].f32[j] = 0;
+ } else if (width == 64) {
+ env->vfp.vreg[dest].f64[j] = 0;
+ } else {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+/* vfsgnjn.vf vd, vs2, rs1, vm # vector-scalar */
+void VECTOR_HELPER(vfsgnjn_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[j] = deposit16(
+ ~env->fpr[rs1],
+ 0,
+ 15,
+ env->vfp.vreg[src2].f16[j]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[j] = deposit32(
+ ~env->fpr[rs1],
+ 0,
+ 31,
+ env->vfp.vreg[src2].f32[j]);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[j] = deposit64(
+ ~env->fpr[rs1],
+ 0,
+ 63,
+ env->vfp.vreg[src2].f64[j]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ switch (width) {
+ case 16:
+ env->vfp.vreg[dest].f16[j] = 0;
+ break;
+ case 32:
+ env->vfp.vreg[dest].f32[j] = 0;
+ break;
+ case 64:
+ env->vfp.vreg[dest].f64[j] = 0;
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfsgnjx.vv vd, vs2, vs1, vm # Vector-vector */
+void VECTOR_HELPER(vfsgnjx_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src1, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ src1 = rs1 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[j] = deposit16(
+ env->vfp.vreg[src1].f16[j] ^
+ env->vfp.vreg[src2].f16[j],
+ 0,
+ 15,
+ env->vfp.vreg[src2].f16[j]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[j] = deposit32(
+ env->vfp.vreg[src1].f32[j] ^
+ env->vfp.vreg[src2].f32[j],
+ 0,
+ 31,
+ env->vfp.vreg[src2].f32[j]);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[j] = deposit64(
+ env->vfp.vreg[src1].f64[j] ^
+ env->vfp.vreg[src2].f64[j],
+ 0,
+ 63,
+ env->vfp.vreg[src2].f64[j]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ switch (width) {
+ case 16:
+ env->vfp.vreg[dest].f16[j] = 0;
+ break;
+ case 32:
+ env->vfp.vreg[dest].f32[j] = 0;
+ break;
+ case 64:
+ env->vfp.vreg[dest].f64[j] = 0;
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfsgnjx.vf vd, vs2, rs1, vm # vector-scalar */
+void VECTOR_HELPER(vfsgnjx_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[j] = deposit16(
+ env->fpr[rs1] ^
+ env->vfp.vreg[src2].f16[j],
+ 0,
+ 15,
+ env->vfp.vreg[src2].f16[j]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[j] = deposit32(
+ env->fpr[rs1] ^
+ env->vfp.vreg[src2].f32[j],
+ 0,
+ 31,
+ env->vfp.vreg[src2].f32[j]);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[j] = deposit64(
+ env->fpr[rs1] ^
+ env->vfp.vreg[src2].f64[j],
+ 0,
+ 63,
+ env->vfp.vreg[src2].f64[j]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ switch (width) {
+ case 16:
+ env->vfp.vreg[dest].f16[j] = 0;
+ break;
+ case 32:
+ env->vfp.vreg[dest].f32[j] = 0;
+ break;
+ case 64:
+ env->vfp.vreg[dest].f64[j] = 0;
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfmerge.vfm vd, vs2, rs1, v0 # vd[i] = v0[i].LSB ? f[rs1] : vs2[i] */
+void VECTOR_HELPER(vfmerge_vfm)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ /* vfmv.v.f vd, rs1 # vd[i] = f[rs1]; */
+ if (vm && (rs2 != 0)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[j] = env->fpr[rs1];
+ } else {
+ env->vfp.vreg[dest].f16[j] = env->vfp.vreg[src2].f16[j];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[j] = env->fpr[rs1];
+ } else {
+ env->vfp.vreg[dest].f32[j] = env->vfp.vreg[src2].f32[j];
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[j] = env->fpr[rs1];
+ } else {
+ env->vfp.vreg[dest].f64[j] = env->vfp.vreg[src2].f64[j];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fcommon(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vmfeq.vv vd, vs2, vs1, vm # Vector-vector */
+void VECTOR_HELPER(vmfeq_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, src1, src2, result, r;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ src1 = rs1 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ r = float16_compare_quiet(env->vfp.vreg[src1].f16[j],
+ env->vfp.vreg[src2].f16[j],
+ &env->fp_status);
+ if (r == float_relation_equal) {
+ result = 1;
+ } else {
+ result = 0;
+ }
+ vector_mask_result(env, rd, width, lmul, i, result);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ result = float32_eq_quiet(env->vfp.vreg[src1].f32[j],
+ env->vfp.vreg[src2].f32[j],
+ &env->fp_status);
+ vector_mask_result(env, rd, width, lmul, i, result);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ result = float64_eq_quiet(env->vfp.vreg[src1].f64[j],
+ env->vfp.vreg[src2].f64[j],
+ &env->fp_status);
+ vector_mask_result(env, rd, width, lmul, i, result);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ switch (width) {
+ case 16:
+ case 32:
+ case 64:
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vmfeq.vf vd, vs2, rs1, vm # vector-scalar */
+void VECTOR_HELPER(vmfeq_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, src2, result, r;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ r = float16_compare_quiet(env->fpr[rs1],
+ env->vfp.vreg[src2].f16[j],
+ &env->fp_status);
+ if (r == float_relation_equal) {
+ result = 1;
+ } else {
+ result = 0;
+ }
+ vector_mask_result(env, rd, width, lmul, i, result);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ result = float32_eq_quiet(env->fpr[rs1],
+ env->vfp.vreg[src2].f32[j],
+ &env->fp_status);
+ vector_mask_result(env, rd, width, lmul, i, result);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ result = float64_eq_quiet(env->fpr[rs1],
+ env->vfp.vreg[src2].f64[j],
+ &env->fp_status);
+ vector_mask_result(env, rd, width, lmul, i, result);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ switch (width) {
+ case 16:
+ case 32:
+ case 64:
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vmfne.vv vd, vs2, vs1, vm # Vector-vector */
+void VECTOR_HELPER(vmfne_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, src1, src2, result, r;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ src1 = rs1 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ r = float16_compare_quiet(env->vfp.vreg[src1].f16[j],
+ env->vfp.vreg[src2].f16[j],
+ &env->fp_status);
+ if (r != float_relation_equal) {
+ result = 1;
+ } else {
+ result = 0;
+ }
+ vector_mask_result(env, rd, width, lmul, i, result);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ result = float32_eq_quiet(env->vfp.vreg[src1].f32[j],
+ env->vfp.vreg[src2].f32[j],
+ &env->fp_status);
+ vector_mask_result(env, rd, width, lmul, i, !result);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ result = float64_eq_quiet(env->vfp.vreg[src1].f64[j],
+ env->vfp.vreg[src2].f64[j],
+ &env->fp_status);
+ vector_mask_result(env, rd, width, lmul, i, !result);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ switch (width) {
+ case 16:
+ case 32:
+ case 64:
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vmfne.vf vd, vs2, rs1, vm # vector-scalar */
+void VECTOR_HELPER(vmfne_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, src2, result, r;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ r = float16_compare_quiet(env->fpr[rs1],
+ env->vfp.vreg[src2].f16[j],
+ &env->fp_status);
+ if (r != float_relation_equal) {
+ result = 1;
+ } else {
+ result = 0;
+ }
+ vector_mask_result(env, rd, width, lmul, i, result);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ result = float32_eq_quiet(env->fpr[rs1],
+ env->vfp.vreg[src2].f32[j],
+ &env->fp_status);
+ vector_mask_result(env, rd, width, lmul, i, !result);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ result = float64_eq_quiet(env->fpr[rs1],
+ env->vfp.vreg[src2].f64[j],
+ &env->fp_status);
+ vector_mask_result(env, rd, width, lmul, i, !result);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ switch (width) {
+ case 16:
+ case 32:
+ case 64:
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vmflt.vv vd, vs2, vs1, vm # Vector-vector */
+void VECTOR_HELPER(vmflt_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, src1, src2, result, r;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ src1 = rs1 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ r = float16_compare(env->vfp.vreg[src2].f16[j],
+ env->vfp.vreg[src1].f16[j],
+ &env->fp_status);
+ if (r == float_relation_less) {
+ result = 1;
+ } else {
+ result = 0;
+ }
+ vector_mask_result(env, rd, width, lmul, i, result);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ result = float32_lt(env->vfp.vreg[src2].f32[j],
+ env->vfp.vreg[src1].f32[j],
+ &env->fp_status);
+ vector_mask_result(env, rd, width, lmul, i, result);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ result = float64_lt(env->vfp.vreg[src2].f64[j],
+ env->vfp.vreg[src1].f64[j],
+ &env->fp_status);
+ vector_mask_result(env, rd, width, lmul, i, result);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ switch (width) {
+ case 16:
+ case 32:
+ case 64:
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vmflt.vf vd, vs2, rs1, vm # vector-scalar */
+void VECTOR_HELPER(vmflt_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, src2, result, r;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ r = float16_compare(env->vfp.vreg[src2].f16[j],
+ env->fpr[rs1],
+ &env->fp_status);
+ if (r == float_relation_less) {
+ result = 1;
+ } else {
+ result = 0;
+ }
+ vector_mask_result(env, rd, width, lmul, i, result);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ result = float32_lt(env->vfp.vreg[src2].f32[j],
+ env->fpr[rs1],
+ &env->fp_status);
+ vector_mask_result(env, rd, width, lmul, i, result);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ result = float64_lt(env->vfp.vreg[src2].f64[j],
+ env->fpr[rs1],
+ &env->fp_status);
+ vector_mask_result(env, rd, width, lmul, i, result);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ switch (width) {
+ case 16:
+ case 32:
+ case 64:
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vmfle.vv vd, vs2, vs1, vm # Vector-vector */
+void VECTOR_HELPER(vmfle_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, src1, src2, result, r;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ src1 = rs1 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ r = float16_compare(env->vfp.vreg[src2].f16[j],
+ env->vfp.vreg[src1].f16[j],
+ &env->fp_status);
+ if ((r == float_relation_less) ||
+ (r == float_relation_equal)) {
+ result = 1;
+ } else {
+ result = 0;
+ }
+ vector_mask_result(env, rd, width, lmul, i, result);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ result = float32_le(env->vfp.vreg[src2].f32[j],
+ env->vfp.vreg[src1].f32[j],
+ &env->fp_status);
+ vector_mask_result(env, rd, width, lmul, i, result);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ result = float64_le(env->vfp.vreg[src2].f64[j],
+ env->vfp.vreg[src1].f64[j],
+ &env->fp_status);
+ vector_mask_result(env, rd, width, lmul, i, result);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ switch (width) {
+ case 16:
+ case 32:
+ case 64:
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vmfle.vf vd, vs2, rs1, vm # vector-scalar */
+void VECTOR_HELPER(vmfle_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, src2, result, r;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ r = float16_compare(env->vfp.vreg[src2].f16[j],
+ env->fpr[rs1],
+ &env->fp_status);
+ if ((r == float_relation_less) ||
+ (r == float_relation_equal)) {
+ result = 1;
+ } else {
+ result = 0;
+ }
+ vector_mask_result(env, rd, width, lmul, i, result);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ result = float32_le(env->vfp.vreg[src2].f32[j],
+ env->fpr[rs1],
+ &env->fp_status);
+ vector_mask_result(env, rd, width, lmul, i, result);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ result = float64_le(env->vfp.vreg[src2].f64[j],
+ env->fpr[rs1],
+ &env->fp_status);
+ vector_mask_result(env, rd, width, lmul, i, result);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ switch (width) {
+ case 16:
+ case 32:
+ case 64:
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vmfgt.vf vd, vs2, rs1, vm # vector-scalar */
+void VECTOR_HELPER(vmfgt_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, src2, result, r;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ switch (width) {
+ case 16:
+ r = float16_compare(env->vfp.vreg[src2].f16[j],
+ env->fpr[rs1],
+ &env->fp_status);
+ break;
+ case 32:
+ r = float32_compare(env->vfp.vreg[src2].f32[j],
+ env->fpr[rs1],
+ &env->fp_status);
+ break;
+ case 64:
+ r = float64_compare(env->vfp.vreg[src2].f64[j],
+ env->fpr[rs1],
+ &env->fp_status);
+ break;
+ default:
+ riscv_raise_exception(env,
+ RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (r == float_relation_greater) {
+ result = 1;
+ } else {
+ result = 0;
+ }
+ vector_mask_result(env, rd, width, lmul, i, result);
+ }
+ } else {
+ switch (width) {
+ case 16:
+ case 32:
+ case 64:
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vmfge.vf vd, vs2, rs1, vm # vector-scalar */
+void VECTOR_HELPER(vmfge_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, src2, result, r;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ switch (width) {
+ case 16:
+ r = float16_compare(env->vfp.vreg[src2].f16[j],
+ env->fpr[rs1],
+ &env->fp_status);
+ break;
+ case 32:
+ r = float32_compare(env->vfp.vreg[src2].f32[j],
+ env->fpr[rs1],
+ &env->fp_status);
+ break;
+ case 64:
+ r = float64_compare(env->vfp.vreg[src2].f64[j],
+ env->fpr[rs1],
+ &env->fp_status);
+ break;
+ default:
+ riscv_raise_exception(env,
+ RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if ((r == float_relation_greater) ||
+ (r == float_relation_equal)) {
+ result = 1;
+ } else {
+ result = 0;
+ }
+ vector_mask_result(env, rd, width, lmul, i, result);
+ }
+ } else {
+ switch (width) {
+ case 16:
+ case 32:
+ case 64:
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vmford.vv vd, vs2, vs1, vm # Vector-vector */
+void VECTOR_HELPER(vmford_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, src1, src2, result, r;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ src1 = rs1 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ r = float16_compare_quiet(env->vfp.vreg[src1].f16[j],
+ env->vfp.vreg[src2].f16[j],
+ &env->fp_status);
+ if (r == float_relation_unordered) {
+ result = 1;
+ } else {
+ result = 0;
+ }
+ vector_mask_result(env, rd, width, lmul, i, !result);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ result = float32_unordered_quiet(env->vfp.vreg[src1].f32[j],
+ env->vfp.vreg[src2].f32[j],
+ &env->fp_status);
+ vector_mask_result(env, rd, width, lmul, i, !result);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ result = float64_unordered_quiet(env->vfp.vreg[src1].f64[j],
+ env->vfp.vreg[src2].f64[j],
+ &env->fp_status);
+ vector_mask_result(env, rd, width, lmul, i, !result);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ switch (width) {
+ case 16:
+ case 32:
+ case 64:
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vmford.vf vd, vs2, rs1, vm # Vector-scalar */
+void VECTOR_HELPER(vmford_vf)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, src2, result, r;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ r = float16_compare_quiet(env->vfp.vreg[src2].f16[j],
+ env->fpr[rs1],
+ &env->fp_status);
+ if (r == float_relation_unordered) {
+ result = 1;
+ } else {
+ result = 0;
+ }
+ vector_mask_result(env, rd, width, lmul, i, !result);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ result = float32_unordered_quiet(env->vfp.vreg[src2].f32[j],
+ env->fpr[rs1],
+ &env->fp_status);
+ vector_mask_result(env, rd, width, lmul, i, !result);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ result = float64_unordered_quiet(env->vfp.vreg[src2].f64[j],
+ env->fpr[rs1],
+ &env->fp_status);
+ vector_mask_result(env, rd, width, lmul, i, !result);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ switch (width) {
+ case 16:
+ case 32:
+ case 64:
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfclass.v vd, vs2, vm # Vector-vector */
+void VECTOR_HELPER(vfclass_v)(CPURISCVState *env, uint32_t vm, uint32_t rs2,
+ uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[j] = helper_fclass_h(
+ env->vfp.vreg[src2].f16[j]);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[j] = helper_fclass_s(
+ env->vfp.vreg[src2].f32[j]);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[j] = helper_fclass_d(
+ env->vfp.vreg[src2].f64[j]);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fcommon(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfcvt.xu.f.v vd, vs2, vm # Convert float to unsigned integer. */
+void VECTOR_HELPER(vfcvt_xu_f_v)(CPURISCVState *env, uint32_t vm,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[j] = float16_to_uint16(
+ env->vfp.vreg[src2].f16[j],
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[j] = float32_to_uint32(
+ env->vfp.vreg[src2].f32[j],
+ &env->fp_status);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[j] = float64_to_uint64(
+ env->vfp.vreg[src2].f64[j],
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fcommon(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfcvt.x.f.v vd, vs2, vm # Convert float to signed integer. */
+void VECTOR_HELPER(vfcvt_x_f_v)(CPURISCVState *env, uint32_t vm,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[j] = float16_to_int16(
+ env->vfp.vreg[src2].f16[j],
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[j] = float32_to_int32(
+ env->vfp.vreg[src2].f32[j],
+ &env->fp_status);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[j] = float64_to_int64(
+ env->vfp.vreg[src2].f64[j],
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fcommon(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfcvt.f.xu.v vd, vs2, vm # Convert unsigned integer to float. */
+void VECTOR_HELPER(vfcvt_f_xu_v)(CPURISCVState *env, uint32_t vm,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[j] = uint16_to_float16(
+ env->vfp.vreg[src2].u16[j],
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[j] = uint32_to_float32(
+ env->vfp.vreg[src2].u32[j],
+ &env->fp_status);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[j] = uint64_to_float64(
+ env->vfp.vreg[src2].u64[j],
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fcommon(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfcvt.f.x.v vd, vs2, vm # Convert signed integer to float. */
+void VECTOR_HELPER(vfcvt_f_x_v)(CPURISCVState *env, uint32_t vm,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[j] = int16_to_float16(
+ env->vfp.vreg[src2].s16[j],
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[j] = int32_to_float32(
+ env->vfp.vreg[src2].s32[j],
+ &env->fp_status);
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[j] = int64_to_float64(
+ env->vfp.vreg[src2].s64[j],
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fcommon(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfwcvt.xu.f.v vd, vs2, vm # Convert float to double-width unsigned integer.*/
+void VECTOR_HELPER(vfwcvt_xu_f_v)(CPURISCVState *env, uint32_t vm,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, k, dest, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ if (lmul > 4) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / (2 * width)));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[k] = float16_to_uint32(
+ env->vfp.vreg[src2].f16[j],
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[k] = float32_to_uint64(
+ env->vfp.vreg[src2].f32[j],
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ }
+ } else {
+ vector_tail_fwiden(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfwcvt.x.f.v vd, vs2, vm # Convert float to double-width signed integer. */
+void VECTOR_HELPER(vfwcvt_x_f_v)(CPURISCVState *env, uint32_t vm,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, k, dest, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ if (lmul > 4) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / (2 * width)));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[k] = float16_to_int32(
+ env->vfp.vreg[src2].f16[j],
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s64[k] = float32_to_int64(
+ env->vfp.vreg[src2].f32[j],
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fwiden(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfwcvt.f.xu.v vd, vs2, vm # Convert unsigned integer to double-width float */
+void VECTOR_HELPER(vfwcvt_f_xu_v)(CPURISCVState *env, uint32_t vm,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, k, dest, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ if (lmul > 4) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / (2 * width)));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[k] = uint16_to_float32(
+ env->vfp.vreg[src2].u16[j],
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[k] = uint32_to_float64(
+ env->vfp.vreg[src2].u32[j],
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fwiden(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfwcvt.f.x.v vd, vs2, vm # Convert signed integer to double-width float. */
+void VECTOR_HELPER(vfwcvt_f_x_v)(CPURISCVState *env, uint32_t vm,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, k, dest, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ if (lmul > 4) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / (2 * width)));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[k] = int16_to_float32(
+ env->vfp.vreg[src2].s16[j],
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[k] = int32_to_float64(
+ env->vfp.vreg[src2].s32[j],
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fwiden(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/*
+ * vfwcvt.f.f.v vd, vs2, vm #
+ * Convert single-width float to double-width float.
+ */
+void VECTOR_HELPER(vfwcvt_f_f_v)(CPURISCVState *env, uint32_t vm,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, k, dest, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, 2 * lmul, rs2, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, true);
+
+ if (lmul > 4) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / (2 * width)));
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ k = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[k] = float16_to_float32(
+ env->vfp.vreg[src2].f16[j],
+ true,
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f64[k] = float32_to_float64(
+ env->vfp.vreg[src2].f32[j],
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fwiden(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfncvt.xu.f.v vd, vs2, vm # Convert double-width float to unsigned integer.*/
+void VECTOR_HELPER(vfncvt_xu_f_v)(CPURISCVState *env, uint32_t vm,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, k, dest, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+ if (vector_vtype_ill(env) ||
+ vector_overlap_vm_common(lmul, vm, rd) ||
+ vector_overlap_dstgp_srcgp(rd, lmul, rs2, 2 * lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, true);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (lmul > 4) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / (2 * width)));
+ k = i % (VLEN / width);
+ j = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[k] = float32_to_uint16(
+ env->vfp.vreg[src2].f32[j],
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[k] = float64_to_uint32(
+ env->vfp.vreg[src2].f64[j],
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fnarrow(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfncvt.x.f.v vd, vs2, vm # Convert double-width float to signed integer. */
+void VECTOR_HELPER(vfncvt_x_f_v)(CPURISCVState *env, uint32_t vm,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, k, dest, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+ if (vector_vtype_ill(env) ||
+ vector_overlap_vm_common(lmul, vm, rd) ||
+ vector_overlap_dstgp_srcgp(rd, lmul, rs2, 2 * lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, true);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (lmul > 4) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / (2 * width)));
+ k = i % (VLEN / width);
+ j = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s16[k] = float32_to_int16(
+ env->vfp.vreg[src2].f32[j],
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].s32[k] = float64_to_int32(
+ env->vfp.vreg[src2].f64[j],
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fnarrow(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfncvt.f.xu.v vd, vs2, vm # Convert double-width unsigned integer to float */
+void VECTOR_HELPER(vfncvt_f_xu_v)(CPURISCVState *env, uint32_t vm,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, k, dest, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) ||
+ vector_overlap_vm_common(lmul, vm, rd) ||
+ vector_overlap_dstgp_srcgp(rd, lmul, rs2, 2 * lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, true);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (lmul > 4) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / (2 * width)));
+ k = i % (VLEN / width);
+ j = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[k] = uint32_to_float16(
+ env->vfp.vreg[src2].u32[j],
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[k] = uint64_to_float32(
+ env->vfp.vreg[src2].u64[j],
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fnarrow(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfncvt.f.x.v vd, vs2, vm # Convert double-width signed integer to float. */
+void VECTOR_HELPER(vfncvt_f_x_v)(CPURISCVState *env, uint32_t vm,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, k, dest, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+ if (vector_vtype_ill(env) ||
+ vector_overlap_vm_common(lmul, vm, rd) ||
+ vector_overlap_dstgp_srcgp(rd, lmul, rs2, 2 * lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, true);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (lmul > 4) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / (2 * width)));
+ k = i % (VLEN / width);
+ j = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[k] = int32_to_float16(
+ env->vfp.vreg[src2].s32[j],
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[k] = int64_to_float32(
+ env->vfp.vreg[src2].s64[j],
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fnarrow(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfncvt.f.f.v vd, vs2, vm # Convert double-width float to single-width. */
+void VECTOR_HELPER(vfncvt_f_f_v)(CPURISCVState *env, uint32_t vm,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, k, dest, src2;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+ if (vector_vtype_ill(env) ||
+ vector_overlap_vm_common(lmul, vm, rd) ||
+ vector_overlap_dstgp_srcgp(rd, lmul, rs2, 2 * lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, true);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (lmul > 4) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src2 = rs2 + (i / (VLEN / (2 * width)));
+ k = i % (VLEN / width);
+ j = i % (VLEN / (2 * width));
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f16[k] = float32_to_float16(
+ env->vfp.vreg[src2].f32[j],
+ true,
+ &env->fp_status);
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].f32[k] = float64_to_float32(
+ env->vfp.vreg[src2].f64[j],
+ &env->fp_status);
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_fnarrow(env, dest, k, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
--
2.7.4
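
Note on the element indexing used by the widening and narrowing helpers
above: element i of a register group lives in register base + i / (VLEN / w)
at index i % (VLEN / w), where w is the element width on that side of the
conversion (width for the single-width operand, 2 * width for the
double-width one). The following is only a minimal stand-alone sketch of
that arithmetic. The names elem_pos and elem_locate are hypothetical and do
not appear in the patch; VLEN is taken as 128, matching the define in cpu.h.

```c
#include <assert.h>

#define VLEN 128  /* bits per vector register, as defined in the patch */

/* Hypothetical type: where element i of a register group is stored. */
struct elem_pos {
    int reg;  /* architectural register number */
    int idx;  /* element index inside that register */
};

/*
 * Locate element i in the group starting at register 'base' when each
 * element is 'width' bits wide. This mirrors the dest/src2 and j/k
 * computations in the helpers, but is not code from the patch.
 */
static struct elem_pos elem_locate(int base, int width, int i)
{
    struct elem_pos p;
    int per_reg;

    assert(width > 0 && VLEN % width == 0 && i >= 0);
    per_reg = VLEN / width;     /* elements held by one register */
    p.reg = base + i / per_reg;
    p.idx = i % per_reg;
    return p;
}
```

For a widening conversion with width = 16, the source side would be
elem_locate(rs2, 16, i) and the destination side elem_locate(rd, 32, i), so
the destination group advances through registers twice as fast as the
source group. The tail helpers take the destination-side index, which is k
in the widening helpers.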
* [Qemu-devel] [PATCH v2 15/17] RISC-V: add vector extension reduction instructions
From: liuzhiwei @ 2019-09-11 6:25 UTC (permalink / raw)
To: Alistair.Francis, palmer, sagark, kbastian, riku.voipio, laurent,
wenmeng_zhang
Cc: qemu-riscv, qemu-devel, wxy194768, LIU Zhiwei
From: LIU Zhiwei <zhiwei_liu@c-sky.com>
Signed-off-by: LIU Zhiwei <zhiwei_liu@c-sky.com>
---
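
Note on the semantics the integer reduction helpers implement, per the
comments in the code ("vd[0] = sum(vs1[0], vs2[*])"): the scalar in vs1[0]
seeds the accumulator, every active element of vs2 below vl is folded in,
and the result is written to element 0 of vd. The following is only a
simplified stand-alone model for SEW=32, assuming a flat array and a
byte-per-element mask instead of the real register file and v0 mask
layout. The name redsum32 and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of vredsum.vs for 32-bit elements.
 * seed  : value of vs1[0]
 * vs2   : source elements, at least vl entries
 * mask  : mask[i] != 0 means element i is active; NULL means unmasked
 * Returns the value that would be written to vd[0].
 * Note: the helper in the patch returns early when vl == 0 and leaves vd
 * untouched; this model simply returns the seed in that case.
 */
static uint32_t redsum32(uint32_t seed, const uint32_t *vs2,
                         const uint8_t *mask, size_t vl)
{
    uint32_t sum = seed;
    size_t i;

    assert(vs2 != NULL || vl == 0);
    for (i = 0; i < vl; i++) {
        if (mask == NULL || mask[i]) {
            sum += vs2[i];  /* unsigned add, wraps modulo 2^32 */
        }
    }
    return sum;
}
```

The and/or/xor/min/max reductions would follow the same shape with a
different fold operation. Inactive elements are skipped, so they
contribute nothing to the result.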
target/riscv/helper.h | 17 +
target/riscv/insn32.decode | 17 +
target/riscv/insn_trans/trans_rvv.inc.c | 17 +
target/riscv/vector_helper.c | 1275 +++++++++++++++++++++++++++++++
4 files changed, 1326 insertions(+)
diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index e2384eb..d36bd00 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -384,5 +384,22 @@ DEF_HELPER_4(vector_vfncvt_f_xu_v, void, env, i32, i32, i32)
DEF_HELPER_4(vector_vfncvt_f_x_v, void, env, i32, i32, i32)
DEF_HELPER_4(vector_vfncvt_f_f_v, void, env, i32, i32, i32)
+DEF_HELPER_5(vector_vredsum_vs, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vredand_vs, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfredsum_vs, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vredor_vs, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vredxor_vs, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfredosum_vs, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vredminu_vs, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vredmin_vs, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfredmin_vs, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vredmaxu_vs, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vredmax_vs, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfredmax_vs, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vwredsumu_vs, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vwredsum_vs, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfwredsum_vs, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vfwredosum_vs, void, env, i32, i32, i32, i32)
+
DEF_HELPER_4(vector_vsetvli, void, env, i32, i32, i32)
DEF_HELPER_4(vector_vsetvl, void, env, i32, i32, i32)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 256d8ea..3f63bc1 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -524,5 +524,22 @@ vfncvt_f_xu_v 100010 . ..... 10010 001 ..... 1010111 @r2_vm
vfncvt_f_x_v 100010 . ..... 10011 001 ..... 1010111 @r2_vm
vfncvt_f_f_v 100010 . ..... 10100 001 ..... 1010111 @r2_vm
+vredsum_vs 000000 . ..... ..... 010 ..... 1010111 @r_vm
+vredand_vs 000001 . ..... ..... 010 ..... 1010111 @r_vm
+vredor_vs 000010 . ..... ..... 010 ..... 1010111 @r_vm
+vredxor_vs 000011 . ..... ..... 010 ..... 1010111 @r_vm
+vredminu_vs 000100 . ..... ..... 010 ..... 1010111 @r_vm
+vredmin_vs 000101 . ..... ..... 010 ..... 1010111 @r_vm
+vredmaxu_vs 000110 . ..... ..... 010 ..... 1010111 @r_vm
+vredmax_vs 000111 . ..... ..... 010 ..... 1010111 @r_vm
+vwredsumu_vs 110000 . ..... ..... 000 ..... 1010111 @r_vm
+vwredsum_vs 110001 . ..... ..... 000 ..... 1010111 @r_vm
+vfredsum_vs 000001 . ..... ..... 001 ..... 1010111 @r_vm
+vfredosum_vs 000011 . ..... ..... 001 ..... 1010111 @r_vm
+vfredmin_vs 000101 . ..... ..... 001 ..... 1010111 @r_vm
+vfredmax_vs 000111 . ..... ..... 001 ..... 1010111 @r_vm
+vfwredsum_vs 110001 . ..... ..... 001 ..... 1010111 @r_vm
+vfwredosum_vs 110011 . ..... ..... 001 ..... 1010111 @r_vm
+
vsetvli 0 ........... ..... 111 ..... 1010111 @r2_zimm
vsetvl 1000000 ..... ..... 111 ..... 1010111 @r
diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index e4d4576..9a3d31b 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -427,5 +427,22 @@ GEN_VECTOR_R2_VM(vfncvt_f_xu_v)
GEN_VECTOR_R2_VM(vfncvt_f_x_v)
GEN_VECTOR_R2_VM(vfncvt_f_f_v)
+GEN_VECTOR_R_VM(vredsum_vs)
+GEN_VECTOR_R_VM(vredand_vs)
+GEN_VECTOR_R_VM(vredor_vs)
+GEN_VECTOR_R_VM(vredxor_vs)
+GEN_VECTOR_R_VM(vredminu_vs)
+GEN_VECTOR_R_VM(vredmin_vs)
+GEN_VECTOR_R_VM(vredmaxu_vs)
+GEN_VECTOR_R_VM(vredmax_vs)
+GEN_VECTOR_R_VM(vwredsumu_vs)
+GEN_VECTOR_R_VM(vwredsum_vs)
+GEN_VECTOR_R_VM(vfredsum_vs)
+GEN_VECTOR_R_VM(vfredosum_vs)
+GEN_VECTOR_R_VM(vfredmin_vs)
+GEN_VECTOR_R_VM(vfredmax_vs)
+GEN_VECTOR_R_VM(vfwredsum_vs)
+GEN_VECTOR_R_VM(vfwredosum_vs)
+
GEN_VECTOR_R2_ZIMM(vsetvli)
GEN_VECTOR_R(vsetvl)
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index fd2ecb7..4a9083b 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -22720,4 +22720,1279 @@ void VECTOR_HELPER(vfncvt_f_f_v)(CPURISCVState *env, uint32_t vm,
return;
}
+/* vredsum.vs vd, vs2, vs1, vm # vd[0] = sum(vs1[0] , vs2[*]) */
+void VECTOR_HELPER(vredsum_vs)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, src2;
+ uint64_t sum = 0;
+
+ lmul = vector_get_lmul(env);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (env->vfp.vstart != 0) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vl = env->vfp.vl;
+ if (vl == 0) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < VLEN / 64; i++) {
+ env->vfp.vreg[rd].u64[i] = 0;
+ }
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+
+ if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ sum += env->vfp.vreg[src2].u8[j];
+ }
+ if (i == 0) {
+ sum += env->vfp.vreg[rs1].u8[0];
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].u8[0] = sum;
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ sum += env->vfp.vreg[src2].u16[j];
+ }
+ if (i == 0) {
+ sum += env->vfp.vreg[rs1].u16[0];
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].u16[0] = sum;
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ sum += env->vfp.vreg[src2].u32[j];
+ }
+ if (i == 0) {
+ sum += env->vfp.vreg[rs1].u32[0];
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].u32[0] = sum;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ sum += env->vfp.vreg[src2].u64[j];
+ }
+ if (i == 0) {
+ sum += env->vfp.vreg[rs1].u64[0];
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].u64[0] = sum;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+
+/* vredand.vs vd, vs2, vs1, vm # vd[0] = and( vs1[0] , vs2[*] ) */
+void VECTOR_HELPER(vredand_vs)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, src2;
+ uint64_t res = 0;
+
+ lmul = vector_get_lmul(env);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (env->vfp.vstart != 0) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vl = env->vfp.vl;
+ if (vl == 0) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < VLEN / 64; i++) {
+ env->vfp.vreg[rd].u64[i] = 0;
+ }
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+
+ if (i < vl) {
+ switch (width) {
+ case 8:
+ if (i == 0) {
+ res = env->vfp.vreg[rs1].u8[0];
+ }
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ res &= env->vfp.vreg[src2].u8[j];
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].u8[0] = res;
+ }
+ break;
+ case 16:
+ if (i == 0) {
+ res = env->vfp.vreg[rs1].u16[0];
+ }
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ res &= env->vfp.vreg[src2].u16[j];
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].u16[0] = res;
+ }
+ break;
+ case 32:
+ if (i == 0) {
+ res = env->vfp.vreg[rs1].u32[0];
+ }
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ res &= env->vfp.vreg[src2].u32[j];
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].u32[0] = res;
+ }
+ break;
+ case 64:
+ if (i == 0) {
+ res = env->vfp.vreg[rs1].u64[0];
+ }
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ res &= env->vfp.vreg[src2].u64[j];
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].u64[0] = res;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfredsum.vs vd, vs2, vs1, vm # Unordered sum */
+void VECTOR_HELPER(vfredsum_vs)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, src2;
+ float16 sum16 = float16_zero;
+ float32 sum32 = float32_zero;
+ float64 sum64 = float64_zero;
+
+ lmul = vector_get_lmul(env);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ if (vector_vtype_ill(env)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (env->vfp.vstart != 0) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vl = env->vfp.vl;
+ if (vl == 0) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < VLEN / 64; i++) {
+ env->vfp.vreg[rd].u64[i] = 0;
+ }
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+
+ if (i < vl) {
+ switch (width) {
+ case 16:
+ if (i == 0) {
+ sum16 = env->vfp.vreg[rs1].f16[0];
+ }
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ sum16 = float16_add(sum16, env->vfp.vreg[src2].f16[j],
+ &env->fp_status);
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].f16[0] = sum16;
+ }
+ break;
+ case 32:
+ if (i == 0) {
+ sum32 = env->vfp.vreg[rs1].f32[0];
+ }
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ sum32 = float32_add(sum32, env->vfp.vreg[src2].f32[j],
+ &env->fp_status);
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].f32[0] = sum32;
+ }
+ break;
+ case 64:
+ if (i == 0) {
+ sum64 = env->vfp.vreg[rs1].f64[0];
+ }
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ sum64 = float64_add(sum64, env->vfp.vreg[src2].f64[j],
+ &env->fp_status);
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].f64[0] = sum64;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vredor.vs vd, vs2, vs1, vm # vd[0] = or( vs1[0] , vs2[*] ) */
+void VECTOR_HELPER(vredor_vs)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, src2;
+ uint64_t res = 0;
+
+ lmul = vector_get_lmul(env);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (env->vfp.vstart != 0) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vl = env->vfp.vl;
+ if (vl == 0) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < VLEN / 64; i++) {
+ env->vfp.vreg[rd].u64[i] = 0;
+ }
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+
+ if (i < vl) {
+ switch (width) {
+ case 8:
+ if (i == 0) {
+ res = env->vfp.vreg[rs1].u8[0];
+ }
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ res |= env->vfp.vreg[src2].u8[j];
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].u8[0] = res;
+ }
+ break;
+ case 16:
+ if (i == 0) {
+ res = env->vfp.vreg[rs1].u16[0];
+ }
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ res |= env->vfp.vreg[src2].u16[j];
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].u16[0] = res;
+ }
+ break;
+ case 32:
+ if (i == 0) {
+ res = env->vfp.vreg[rs1].u32[0];
+ }
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ res |= env->vfp.vreg[src2].u32[j];
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].u32[0] = res;
+ }
+ break;
+ case 64:
+ if (i == 0) {
+ res = env->vfp.vreg[rs1].u64[0];
+ }
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ res |= env->vfp.vreg[src2].u64[j];
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].u64[0] = res;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vredxor.vs vd, vs2, vs1, vm # vd[0] = xor( vs1[0] , vs2[*] ) */
+void VECTOR_HELPER(vredxor_vs)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, src2;
+ uint64_t res = 0;
+
+ lmul = vector_get_lmul(env);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (env->vfp.vstart != 0) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vl = env->vfp.vl;
+ if (vl == 0) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < VLEN / 64; i++) {
+ env->vfp.vreg[rd].u64[i] = 0;
+ }
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+
+ if (i < vl) {
+ switch (width) {
+ case 8:
+ if (i == 0) {
+ res = env->vfp.vreg[rs1].u8[0];
+ }
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ res ^= env->vfp.vreg[src2].u8[j];
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].u8[0] = res;
+ }
+ break;
+ case 16:
+ if (i == 0) {
+ res = env->vfp.vreg[rs1].u16[0];
+ }
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ res ^= env->vfp.vreg[src2].u16[j];
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].u16[0] = res;
+ }
+ break;
+ case 32:
+ if (i == 0) {
+ res = env->vfp.vreg[rs1].u32[0];
+ }
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ res ^= env->vfp.vreg[src2].u32[j];
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].u32[0] = res;
+ }
+ break;
+ case 64:
+ if (i == 0) {
+ res = env->vfp.vreg[rs1].u64[0];
+ }
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ res ^= env->vfp.vreg[src2].u64[j];
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].u64[0] = res;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfredosum.vs vd, vs2, vs1, vm # Ordered sum */
+void VECTOR_HELPER(vfredosum_vs)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ helper_vector_vfredsum_vs(env, vm, rs1, rs2, rd);
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vredminu.vs vd, vs2, vs1, vm # vd[0] = minu( vs1[0] , vs2[*] ) */
+void VECTOR_HELPER(vredminu_vs)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, src2;
+ uint64_t minu = 0;
+
+ lmul = vector_get_lmul(env);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ if (vector_vtype_ill(env)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (env->vfp.vstart != 0) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vl = env->vfp.vl;
+ if (vl == 0) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < VLEN / 64; i++) {
+ env->vfp.vreg[rd].u64[i] = 0;
+ }
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+
+ if (i < vl) {
+ switch (width) {
+ case 8:
+ if (i == 0) {
+ minu = env->vfp.vreg[rs1].u8[0];
+ }
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (minu > env->vfp.vreg[src2].u8[j]) {
+ minu = env->vfp.vreg[src2].u8[j];
+ }
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].u8[0] = minu;
+ }
+ break;
+ case 16:
+ if (i == 0) {
+ minu = env->vfp.vreg[rs1].u16[0];
+ }
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (minu > env->vfp.vreg[src2].u16[j]) {
+ minu = env->vfp.vreg[src2].u16[j];
+ }
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].u16[0] = minu;
+ }
+ break;
+ case 32:
+ if (i == 0) {
+ minu = env->vfp.vreg[rs1].u32[0];
+ }
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (minu > env->vfp.vreg[src2].u32[j]) {
+ minu = env->vfp.vreg[src2].u32[j];
+ }
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].u32[0] = minu;
+ }
+ break;
+ case 64:
+ if (i == 0) {
+ minu = env->vfp.vreg[rs1].u64[0];
+ }
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (minu > env->vfp.vreg[src2].u64[j]) {
+ minu = env->vfp.vreg[src2].u64[j];
+ }
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].u64[0] = minu;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vredmin.vs vd, vs2, vs1, vm # vd[0] = min( vs1[0] , vs2[*] ) */
+void VECTOR_HELPER(vredmin_vs)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, src2;
+ int64_t min = 0;
+
+ lmul = vector_get_lmul(env);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ if (vector_vtype_ill(env)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (env->vfp.vstart != 0) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vl = env->vfp.vl;
+ if (vl == 0) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < VLEN / 64; i++) {
+ env->vfp.vreg[rd].u64[i] = 0;
+ }
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+
+ if (i < vl) {
+ switch (width) {
+ case 8:
+ if (i == 0) {
+ min = env->vfp.vreg[rs1].s8[0];
+ }
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (min > env->vfp.vreg[src2].s8[j]) {
+ min = env->vfp.vreg[src2].s8[j];
+ }
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].s8[0] = min;
+ }
+ break;
+ case 16:
+ if (i == 0) {
+ min = env->vfp.vreg[rs1].s16[0];
+ }
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (min > env->vfp.vreg[src2].s16[j]) {
+ min = env->vfp.vreg[src2].s16[j];
+ }
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].s16[0] = min;
+ }
+ break;
+ case 32:
+ if (i == 0) {
+ min = env->vfp.vreg[rs1].s32[0];
+ }
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (min > env->vfp.vreg[src2].s32[j]) {
+ min = env->vfp.vreg[src2].s32[j];
+ }
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].s32[0] = min;
+ }
+ break;
+ case 64:
+ if (i == 0) {
+ min = env->vfp.vreg[rs1].s64[0];
+ }
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (min > env->vfp.vreg[src2].s64[j]) {
+ min = env->vfp.vreg[src2].s64[j];
+ }
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].s64[0] = min;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfredmin.vs vd, vs2, vs1, vm # Minimum value */
+void VECTOR_HELPER(vfredmin_vs)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, src2;
+ float16 min16 = 0.0f;
+ float32 min32 = 0.0f;
+ float64 min64 = 0.0f;
+
+ lmul = vector_get_lmul(env);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ if (vector_vtype_ill(env)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (env->vfp.vstart != 0) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vl = env->vfp.vl;
+ if (vl == 0) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < VLEN / 64; i++) {
+ env->vfp.vreg[rd].u64[i] = 0;
+ }
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+
+ if (i < vl) {
+ switch (width) {
+ case 16:
+ if (i == 0) {
+ min16 = env->vfp.vreg[rs1].f16[0];
+ }
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ min16 = float16_minnum(min16, env->vfp.vreg[src2].f16[j],
+ &env->fp_status);
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].f16[0] = min16;
+ }
+ break;
+ case 32:
+ if (i == 0) {
+ min32 = env->vfp.vreg[rs1].f32[0];
+ }
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ min32 = float32_minnum(min32, env->vfp.vreg[src2].f32[j],
+ &env->fp_status);
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].f32[0] = min32;
+ }
+ break;
+ case 64:
+ if (i == 0) {
+ min64 = env->vfp.vreg[rs1].f64[0];
+ }
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ min64 = float64_minnum(min64, env->vfp.vreg[src2].f64[j],
+ &env->fp_status);
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].f64[0] = min64;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vredmaxu.vs vd, vs2, vs1, vm # vd[0] = maxu( vs1[0] , vs2[*] ) */
+void VECTOR_HELPER(vredmaxu_vs)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, src2;
+ uint64_t maxu = 0;
+
+ lmul = vector_get_lmul(env);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ if (vector_vtype_ill(env)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (env->vfp.vstart != 0) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vl = env->vfp.vl;
+ if (vl == 0) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < VLEN / 64; i++) {
+ env->vfp.vreg[rd].u64[i] = 0;
+ }
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+
+ if (i < vl) {
+ switch (width) {
+ case 8:
+ if (i == 0) {
+ maxu = env->vfp.vreg[rs1].u8[0];
+ }
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (maxu < env->vfp.vreg[src2].u8[j]) {
+ maxu = env->vfp.vreg[src2].u8[j];
+ }
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].u8[0] = maxu;
+ }
+ break;
+ case 16:
+ if (i == 0) {
+ maxu = env->vfp.vreg[rs1].u16[0];
+ }
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (maxu < env->vfp.vreg[src2].u16[j]) {
+ maxu = env->vfp.vreg[src2].u16[j];
+ }
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].u16[0] = maxu;
+ }
+ break;
+ case 32:
+ if (i == 0) {
+ maxu = env->vfp.vreg[rs1].u32[0];
+ }
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (maxu < env->vfp.vreg[src2].u32[j]) {
+ maxu = env->vfp.vreg[src2].u32[j];
+ }
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].u32[0] = maxu;
+ }
+ break;
+ case 64:
+ if (i == 0) {
+ maxu = env->vfp.vreg[rs1].u64[0];
+ }
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (maxu < env->vfp.vreg[src2].u64[j]) {
+ maxu = env->vfp.vreg[src2].u64[j];
+ }
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].u64[0] = maxu;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+/* vredmax.vs vd, vs2, vs1, vm # vd[0] = max( vs1[0] , vs2[*] ) */
+void VECTOR_HELPER(vredmax_vs)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, src2;
+ int64_t max = 0;
+
+ lmul = vector_get_lmul(env);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ if (vector_vtype_ill(env)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (env->vfp.vstart != 0) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vl = env->vfp.vl;
+ if (vl == 0) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < VLEN / 64; i++) {
+ env->vfp.vreg[rd].u64[i] = 0;
+ }
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+
+ if (i < vl) {
+ switch (width) {
+ case 8:
+ if (i == 0) {
+ max = env->vfp.vreg[rs1].s8[0];
+ }
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (max < env->vfp.vreg[src2].s8[j]) {
+ max = env->vfp.vreg[src2].s8[j];
+ }
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].s8[0] = max;
+ }
+ break;
+ case 16:
+ if (i == 0) {
+ max = env->vfp.vreg[rs1].s16[0];
+ }
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (max < env->vfp.vreg[src2].s16[j]) {
+ max = env->vfp.vreg[src2].s16[j];
+ }
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].s16[0] = max;
+ }
+ break;
+ case 32:
+ if (i == 0) {
+ max = env->vfp.vreg[rs1].s32[0];
+ }
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (max < env->vfp.vreg[src2].s32[j]) {
+ max = env->vfp.vreg[src2].s32[j];
+ }
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].s32[0] = max;
+ }
+ break;
+ case 64:
+ if (i == 0) {
+ max = env->vfp.vreg[rs1].s64[0];
+ }
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (max < env->vfp.vreg[src2].s64[j]) {
+ max = env->vfp.vreg[src2].s64[j];
+ }
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].s64[0] = max;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfredmax.vs vd, vs2, vs1, vm # Maximum value */
+void VECTOR_HELPER(vfredmax_vs)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, src2;
+ float16 max16 = 0.0f;
+ float32 max32 = 0.0f;
+ float64 max64 = 0.0f;
+
+ lmul = vector_get_lmul(env);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ if (vector_vtype_ill(env)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (env->vfp.vstart != 0) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vl = env->vfp.vl;
+ if (vl == 0) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < VLEN / 64; i++) {
+ env->vfp.vreg[rd].u64[i] = 0;
+ }
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+
+ if (i < vl) {
+ switch (width) {
+ case 16:
+ if (i == 0) {
+ max16 = env->vfp.vreg[rs1].f16[0];
+ }
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ max16 = float16_maxnum(max16, env->vfp.vreg[src2].f16[j],
+ &env->fp_status);
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].f16[0] = max16;
+ }
+ break;
+ case 32:
+ if (i == 0) {
+ max32 = env->vfp.vreg[rs1].f32[0];
+ }
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ max32 = float32_maxnum(max32, env->vfp.vreg[src2].f32[j],
+ &env->fp_status);
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].f32[0] = max32;
+ }
+ break;
+ case 64:
+ if (i == 0) {
+ max64 = env->vfp.vreg[rs1].f64[0];
+ }
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ max64 = float64_maxnum(max64, env->vfp.vreg[src2].f64[j],
+ &env->fp_status);
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].f64[0] = max64;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vwredsumu.vs vd, vs2, vs1, vm # 2*SEW = 2*SEW + sum(zero-extend(SEW)) */
+void VECTOR_HELPER(vwredsumu_vs)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, src2;
+ uint64_t sum = 0;
+
+ lmul = vector_get_lmul(env);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ if (vector_vtype_ill(env)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (env->vfp.vstart != 0) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vl = env->vfp.vl;
+ if (vl == 0) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < VLEN / 64; i++) {
+ env->vfp.vreg[rd].u64[i] = 0;
+ }
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+
+ if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ sum += env->vfp.vreg[src2].u8[j];
+ }
+ if (i == 0) {
+ sum += env->vfp.vreg[rs1].u16[0];
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].u16[0] = sum;
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ sum += env->vfp.vreg[src2].u16[j];
+ }
+ if (i == 0) {
+ sum += env->vfp.vreg[rs1].u32[0];
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].u32[0] = sum;
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ sum += env->vfp.vreg[src2].u32[j];
+ }
+ if (i == 0) {
+ sum += env->vfp.vreg[rs1].u64[0];
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].u64[0] = sum;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vwredsum.vs vd, vs2, vs1, vm # 2*SEW = 2*SEW + sum(sign-extend(SEW)) */
+void VECTOR_HELPER(vwredsum_vs)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, src2;
+ int64_t sum = 0;
+
+ lmul = vector_get_lmul(env);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ if (vector_vtype_ill(env)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (env->vfp.vstart != 0) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vl = env->vfp.vl;
+ if (vl == 0) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < VLEN / 64; i++) {
+ env->vfp.vreg[rd].u64[i] = 0;
+ }
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+
+ if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ sum += (int16_t)env->vfp.vreg[src2].s8[j] << 8 >> 8;
+ }
+ if (i == 0) {
+ sum += env->vfp.vreg[rs1].s16[0];
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].s16[0] = sum;
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ sum += (int32_t)env->vfp.vreg[src2].s16[j] << 16 >> 16;
+ }
+ if (i == 0) {
+ sum += env->vfp.vreg[rs1].s32[0];
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].s32[0] = sum;
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ sum += (int64_t)env->vfp.vreg[src2].s32[j] << 32 >> 32;
+ }
+ if (i == 0) {
+ sum += env->vfp.vreg[rs1].s64[0];
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].s64[0] = sum;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/*
+ * vfwredsum.vs vd, vs2, vs1, vm #
+ * Unordered reduce 2*SEW = 2*SEW + sum(promote(SEW))
+ */
+void VECTOR_HELPER(vfwredsum_vs)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, src2;
+ float32 sum32 = 0.0f;
+ float64 sum64 = 0.0f;
+
+ lmul = vector_get_lmul(env);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+
+ if (vector_vtype_ill(env)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (env->vfp.vstart != 0) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vl = env->vfp.vl;
+ if (vl == 0) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < VLEN / 64; i++) {
+ env->vfp.vreg[rd].u64[i] = 0;
+ }
+
+ for (i = 0; i < vlmax; i++) {
+ src2 = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+
+ if (i < vl) {
+ switch (width) {
+ case 16:
+ if (i == 0) {
+ sum32 = env->vfp.vreg[rs1].f32[0];
+ }
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ sum32 = float32_add(sum32,
+ float16_to_float32(env->vfp.vreg[src2].f16[j],
+ true, &env->fp_status),
+ &env->fp_status);
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].f32[0] = sum32;
+ }
+ break;
+ case 32:
+ if (i == 0) {
+ sum64 = env->vfp.vreg[rs1].f64[0];
+ }
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ sum64 = float64_add(sum64,
+ float32_to_float64(env->vfp.vreg[src2].f32[j],
+ &env->fp_status),
+ &env->fp_status);
+ }
+ if (i == vl - 1) {
+ env->vfp.vreg[rd].f64[0] = sum64;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/*
+ * vfwredosum.vs vd, vs2, vs1, vm #
+ * Ordered reduce 2*SEW = 2*SEW + sum(promote(SEW))
+ */
+void VECTOR_HELPER(vfwredosum_vs)(CPURISCVState *env, uint32_t vm,
+ uint32_t rs1, uint32_t rs2, uint32_t rd)
+{
+ helper_vector_vfwredsum_vs(env, vm, rs1, rs2, rd);
+ env->vfp.vstart = 0;
+ return;
+}
--
2.7.4
* Re: [Qemu-devel] [PATCH v2 15/17] RISC-V: add vector extension reduction instructions
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 15/17] RISC-V: add vector extension reduction instructions liuzhiwei
@ 2019-09-12 16:54 ` Richard Henderson
0 siblings, 0 replies; 43+ messages in thread
From: Richard Henderson @ 2019-09-12 16:54 UTC (permalink / raw)
To: liuzhiwei, Alistair.Francis, palmer, sagark, kbastian,
riku.voipio, laurent, wenmeng_zhang
Cc: qemu-riscv, qemu-devel, wxy194768
On 9/11/19 2:25 AM, liuzhiwei wrote:
> +/* vredsum.vs vd, vs2, vs1, vm # vd[0] = sum(vs1[0] , vs2[*]) */
> +void VECTOR_HELPER(vredsum_vs)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
> + uint32_t rs2, uint32_t rd)
> +{
>
> + int width, lmul, vl, vlmax;
> + int i, j, src2;
> + uint64_t sum = 0;
> +
> + lmul = vector_get_lmul(env);
> + vector_lmul_check_reg(env, lmul, rs2, false);
> +
> + if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
> + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
> + return;
> + }
> + if (env->vfp.vstart != 0) {
> + riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
> + return;
> + }
> +
> + vl = env->vfp.vl;
> + if (vl == 0) {
> + return;
> + }
> +
> + width = vector_get_width(env);
> + vlmax = vector_get_vlmax(env);
> +
> + for (i = 0; i < VLEN / 64; i++) {
> + env->vfp.vreg[rd].u64[i] = 0;
> + }
> +
There is no requirement that I see for vd != vs1 && vd != vs2. Thus clearing
vd before the operation may clobber the inputs.
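To illustrate the aliasing hazard, here is a minimal sketch on a toy register file. The names ToyVreg, toy_redsum_clear_first and toy_redsum_read_first are hypothetical and are not part of the patch or of QEMU; masking and LMUL are left out, and vl == 0 is assumed to be handled by the caller. The first function zeroes vd up front, as the quoted code does; the second reads every input before touching vd.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NELEMS 4 /* elements per register in this toy model (assumption) */

/* Hypothetical stand-in for one vector register. */
typedef struct {
    uint32_t e[NELEMS];
} ToyVreg;

/*
 * Zeroes vd before reading the sources, mirroring the order in the
 * quoted helper. If rd aliases rs1 or rs2, the inputs are already gone
 * by the time they are read.
 */
static void toy_redsum_clear_first(ToyVreg *v, int rd, int rs1, int rs2,
                                   int vl)
{
    assert(vl > 0 && vl <= NELEMS);
    memset(&v[rd], 0, sizeof(v[rd]));
    uint32_t sum = v[rs1].e[0];
    for (int i = 0; i < vl; i++) {
        sum += v[rs2].e[i];
    }
    v[rd].e[0] = sum;
}

/*
 * Consumes vs1[0] and vs2[*] first, then clears and writes vd, so the
 * result does not depend on whether rd overlaps a source.
 */
static void toy_redsum_read_first(ToyVreg *v, int rd, int rs1, int rs2,
                                  int vl)
{
    assert(vl > 0 && vl <= NELEMS);
    uint32_t sum = v[rs1].e[0];
    for (int i = 0; i < vl; i++) {
        sum += v[rs2].e[i];
    }
    memset(&v[rd], 0, sizeof(v[rd]));
    v[rd].e[0] = sum;
}
```

With vs2 = {1, 2, 3, 4}, vs1[0] = 10 and vd == vs2, the clear-first variant stores 10 in vd[0], while the read-first variant stores 20.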
> + if (i < vl) {
> + switch (width) {
> + case 8:
> + if (vector_elem_mask(env, vm, width, lmul, i)) {
> + sum += env->vfp.vreg[src2].u8[j];
> + }
> + if (i == 0) {
> + sum += env->vfp.vreg[rs1].u8[0];
> + }
Hoist the rs1 case outside the loop.
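A rough sketch of what the hoisted form could look like for the SEW=8 case, using plain arrays instead of CPURISCVState. The function toy_redsum_u8 and its parameters are hypothetical; the active[] array stands in for whatever vector_elem_mask() would return per element, and vl == 0 is assumed to be filtered out by the caller as in the quoted helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * vd[0] = vs1[0] + sum of the active vs2[i], for i < vl.
 * The vs1[0] seed is read once before the loop, so the loop body needs
 * no "i == 0" test and no "i == vl - 1" test either.
 */
static uint8_t toy_redsum_u8(const uint8_t *vs2, const bool *active,
                             uint8_t vs1_0, int vl)
{
    assert(vl > 0);
    uint64_t sum = vs1_0; /* hoisted rs1 read */
    for (int i = 0; i < vl; i++) {
        if (active[i]) {
            sum += vs2[i];
        }
    }
    return (uint8_t)sum; /* truncate to SEW = 8 bits */
}
```

The caller would then write the returned value to vd[0] once, after the loop, which also removes the per-iteration store check.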
r~
* [Qemu-devel] [PATCH v2 16/17] RISC-V: add vector extension mask instructions
2019-09-11 6:25 [Qemu-devel] [PATCH v2 00/17] RISC-V: support vector extension liuzhiwei
` (14 preceding siblings ...)
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 15/17] RISC-V: add vector extension reduction instructions liuzhiwei
@ 2019-09-11 6:25 ` liuzhiwei
2019-09-12 17:07 ` Richard Henderson
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 17/17] RISC-V: add vector extension premutation instructions liuzhiwei
2019-09-11 7:00 ` [Qemu-devel] [PATCH v2 00/17] RISC-V: support vector extension Aleksandar Markovic
17 siblings, 1 reply; 43+ messages in thread
From: liuzhiwei @ 2019-09-11 6:25 UTC (permalink / raw)
To: Alistair.Francis, palmer, sagark, kbastian, riku.voipio, laurent,
wenmeng_zhang
Cc: qemu-riscv, qemu-devel, wxy194768, LIU Zhiwei
From: LIU Zhiwei <zhiwei_liu@c-sky.com>
Signed-off-by: LIU Zhiwei <zhiwei_liu@c-sky.com>
---
target/riscv/helper.h | 16 +
target/riscv/insn32.decode | 17 +
target/riscv/insn_trans/trans_rvv.inc.c | 27 ++
target/riscv/vector_helper.c | 635 ++++++++++++++++++++++++++++++++
4 files changed, 695 insertions(+)
diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index d36bd00..337ac2e 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -401,5 +401,21 @@ DEF_HELPER_5(vector_vwredsum_vs, void, env, i32, i32, i32, i32)
DEF_HELPER_5(vector_vfwredsum_vs, void, env, i32, i32, i32, i32)
DEF_HELPER_5(vector_vfwredosum_vs, void, env, i32, i32, i32, i32)
+DEF_HELPER_4(vector_vmandnot_mm, void, env, i32, i32, i32)
+DEF_HELPER_4(vector_vmand_mm, void, env, i32, i32, i32)
+DEF_HELPER_4(vector_vmor_mm, void, env, i32, i32, i32)
+DEF_HELPER_4(vector_vmxor_mm, void, env, i32, i32, i32)
+DEF_HELPER_4(vector_vmornot_mm, void, env, i32, i32, i32)
+DEF_HELPER_4(vector_vmnand_mm, void, env, i32, i32, i32)
+DEF_HELPER_4(vector_vmnor_mm, void, env, i32, i32, i32)
+DEF_HELPER_4(vector_vmxnor_mm, void, env, i32, i32, i32)
+DEF_HELPER_4(vector_vmsbf_m, void, env, i32, i32, i32)
+DEF_HELPER_4(vector_vmsof_m, void, env, i32, i32, i32)
+DEF_HELPER_4(vector_vmsif_m, void, env, i32, i32, i32)
+DEF_HELPER_4(vector_viota_m, void, env, i32, i32, i32)
+DEF_HELPER_3(vector_vid_v, void, env, i32, i32)
+DEF_HELPER_4(vector_vmpopc_m, void, env, i32, i32, i32)
+DEF_HELPER_4(vector_vmfirst_m, void, env, i32, i32, i32)
+
DEF_HELPER_4(vector_vsetvli, void, env, i32, i32, i32)
DEF_HELPER_4(vector_vsetvl, void, env, i32, i32, i32)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 3f63bc1..1de776b 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -68,6 +68,7 @@
@r_nfvm nf:3 ... vm:1 ..... ..... ... ..... ....... %rs2 %rs1 %rd
@r2_nfvm nf:3 ... vm:1 ..... ..... ... ..... ....... %rs1 %rd
@r2_vm ...... vm:1 ..... ..... ... ..... ....... %rs2 %rd
+@r1_vm ...... vm:1 ..... ..... ... ..... ....... %rd
@r2_zimm . zimm:11 ..... ... ..... ....... %rs1 %rd
@sfence_vma ....... ..... ..... ... ..... ....... %rs2 %rs1
@@ -541,5 +542,21 @@ vfredmax_vs 000111 . ..... ..... 001 ..... 1010111 @r_vm
vfwredsum_vs 110001 . ..... ..... 001 ..... 1010111 @r_vm
vfwredosum_vs 110011 . ..... ..... 001 ..... 1010111 @r_vm
+vmand_mm 011001 - ..... ..... 010 ..... 1010111 @r
+vmnand_mm 011101 - ..... ..... 010 ..... 1010111 @r
+vmandnot_mm 011000 - ..... ..... 010 ..... 1010111 @r
+vmor_mm 011010 - ..... ..... 010 ..... 1010111 @r
+vmxor_mm 011011 - ..... ..... 010 ..... 1010111 @r
+vmnor_mm 011110 - ..... ..... 010 ..... 1010111 @r
+vmornot_mm 011100 - ..... ..... 010 ..... 1010111 @r
+vmxnor_mm 011111 - ..... ..... 010 ..... 1010111 @r
+vmpopc_m 010100 . ..... ----- 010 ..... 1010111 @r2_vm
+vmfirst_m 010101 . ..... ----- 010 ..... 1010111 @r2_vm
+vmsbf_m 010110 . ..... 00001 010 ..... 1010111 @r2_vm
+vmsof_m 010110 . ..... 00010 010 ..... 1010111 @r2_vm
+vmsif_m 010110 . ..... 00011 010 ..... 1010111 @r2_vm
+viota_m 010110 . ..... 10000 010 ..... 1010111 @r2_vm
+vid_v 010110 . 00000 10001 010 ..... 1010111 @r1_vm
+
vsetvli 0 ........... ..... 111 ..... 1010111 @r2_zimm
vsetvl 1000000 ..... ..... 111 ..... 1010111 @r
diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index 9a3d31b..85e435a 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -77,6 +77,17 @@ static bool trans_##INSN(DisasContext *ctx, arg_##INSN * a) \
return true; \
}
+#define GEN_VECTOR_R1_VM(INSN) \
+static bool trans_##INSN(DisasContext *ctx, arg_##INSN * a) \
+{ \
+ TCGv_i32 d = tcg_const_i32(a->rd); \
+ TCGv_i32 vm = tcg_const_i32(a->vm); \
+ gen_helper_vector_##INSN(cpu_env, vm, d); \
+ tcg_temp_free_i32(d); \
+ tcg_temp_free_i32(vm); \
+ return true; \
+}
+
#define GEN_VECTOR_R_VM(INSN) \
static bool trans_##INSN(DisasContext *ctx, arg_##INSN * a) \
{ \
@@ -444,5 +455,21 @@ GEN_VECTOR_R_VM(vfredmax_vs)
GEN_VECTOR_R_VM(vfwredsum_vs)
GEN_VECTOR_R_VM(vfwredosum_vs)
+GEN_VECTOR_R(vmandnot_mm)
+GEN_VECTOR_R(vmand_mm)
+GEN_VECTOR_R(vmor_mm)
+GEN_VECTOR_R(vmxor_mm)
+GEN_VECTOR_R(vmornot_mm)
+GEN_VECTOR_R(vmnand_mm)
+GEN_VECTOR_R(vmnor_mm)
+GEN_VECTOR_R(vmxnor_mm)
+GEN_VECTOR_R2_VM(vmpopc_m)
+GEN_VECTOR_R2_VM(vmfirst_m)
+GEN_VECTOR_R2_VM(vmsbf_m)
+GEN_VECTOR_R2_VM(vmsof_m)
+GEN_VECTOR_R2_VM(vmsif_m)
+GEN_VECTOR_R2_VM(viota_m)
+GEN_VECTOR_R1_VM(vid_v)
+
GEN_VECTOR_R2_ZIMM(vsetvli)
GEN_VECTOR_R(vsetvl)
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 4a9083b..9e15df9 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -1232,6 +1232,15 @@ static inline int vector_get_carry(CPURISCVState *env, int width, int lmul,
return (env->vfp.vreg[0].u8[idx] >> pos) & 0x1;
}
+static inline int vector_mask_reg(CPURISCVState *env, uint32_t reg, int width,
+ int lmul, int index)
+{
+ int mlen = width / lmul;
+ int idx = (index * mlen) / 8;
+ int pos = (index * mlen) % 8;
+ return (env->vfp.vreg[reg].u8[idx] >> pos) & 0x1;
+}
+
static inline void vector_mask_result(CPURISCVState *env, uint32_t reg,
int width, int lmul, int index, uint32_t result)
{
@@ -23996,3 +24005,629 @@ void VECTOR_HELPER(vfwredosum_vs)(CPURISCVState *env, uint32_t vm,
env->vfp.vstart = 0;
return;
}
+
+/* vmandnot.mm vd, vs2, vs1 # vd = vs2 & ~vs1 */
+void VECTOR_HELPER(vmandnot_mm)(CPURISCVState *env, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, i, vlmax;
+ uint32_t tmp;
+
+ if (vector_vtype_ill(env)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ vl = env->vfp.vl;
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ for (i = 0; i < vlmax; i++) {
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ tmp = ~vector_mask_reg(env, rs1, width, lmul, i) &
+ vector_mask_reg(env, rs2, width, lmul, i);
+ vector_mask_result(env, rd, width, lmul, i, tmp);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+
+ env->vfp.vstart = 0;
+ return;
+}
+/* vmand.mm vd, vs2, vs1 # vd = vs2 & vs1 */
+void VECTOR_HELPER(vmand_mm)(CPURISCVState *env, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, i, vlmax;
+ uint32_t tmp;
+
+ if (vector_vtype_ill(env)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ vl = env->vfp.vl;
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ for (i = 0; i < vlmax; i++) {
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ tmp = vector_mask_reg(env, rs1, width, lmul, i) &
+ vector_mask_reg(env, rs2, width, lmul, i);
+ vector_mask_result(env, rd, width, lmul, i, tmp);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+
+ env->vfp.vstart = 0;
+ return;
+}
+/* vmor.mm vd, vs2, vs1 # vd = vs2 | vs1 */
+void VECTOR_HELPER(vmor_mm)(CPURISCVState *env, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, i, vlmax;
+ uint32_t tmp;
+
+ if (vector_vtype_ill(env)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ vl = env->vfp.vl;
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ for (i = 0; i < vlmax; i++) {
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ tmp = vector_mask_reg(env, rs1, width, lmul, i) |
+ vector_mask_reg(env, rs2, width, lmul, i);
+ vector_mask_result(env, rd, width, lmul, i, tmp & 0x1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+
+ env->vfp.vstart = 0;
+ return;
+}
+/* vmxor.mm vd, vs2, vs1 # vd = vs2 ^ vs1 */
+void VECTOR_HELPER(vmxor_mm)(CPURISCVState *env, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, i, vlmax;
+ uint32_t tmp;
+
+ if (vector_vtype_ill(env)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ vl = env->vfp.vl;
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ for (i = 0; i < vlmax; i++) {
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ tmp = vector_mask_reg(env, rs1, width, lmul, i) ^
+ vector_mask_reg(env, rs2, width, lmul, i);
+ vector_mask_result(env, rd, width, lmul, i, tmp & 0x1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+
+ env->vfp.vstart = 0;
+ return;
+}
+/* vmornot.mm vd, vs2, vs1 # vd = vs2 | ~vs1 */
+void VECTOR_HELPER(vmornot_mm)(CPURISCVState *env, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, i, vlmax;
+ uint32_t tmp;
+
+ if (vector_vtype_ill(env)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ vl = env->vfp.vl;
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ for (i = 0; i < vlmax; i++) {
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ tmp = ~vector_mask_reg(env, rs1, width, lmul, i) |
+ vector_mask_reg(env, rs2, width, lmul, i);
+ vector_mask_result(env, rd, width, lmul, i, tmp & 0x1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+
+ env->vfp.vstart = 0;
+ return;
+}
+/* vmnand.mm vd, vs2, vs1 # vd = ~(vs2 & vs1) */
+void VECTOR_HELPER(vmnand_mm)(CPURISCVState *env, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, i, vlmax;
+ uint32_t tmp;
+
+ if (vector_vtype_ill(env)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ vl = env->vfp.vl;
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ for (i = 0; i < vlmax; i++) {
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ tmp = vector_mask_reg(env, rs1, width, lmul, i) &
+ vector_mask_reg(env, rs2, width, lmul, i);
+ vector_mask_result(env, rd, width, lmul, i, (~tmp & 0x1));
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+
+ env->vfp.vstart = 0;
+ return;
+}
+/* vmnor.mm vd, vs2, vs1 # vd = ~(vs2 | vs1) */
+void VECTOR_HELPER(vmnor_mm)(CPURISCVState *env, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, i, vlmax;
+ uint32_t tmp;
+
+ if (vector_vtype_ill(env)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ vl = env->vfp.vl;
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ for (i = 0; i < vlmax; i++) {
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ tmp = vector_mask_reg(env, rs1, width, lmul, i) |
+ vector_mask_reg(env, rs2, width, lmul, i);
+ vector_mask_result(env, rd, width, lmul, i, ~tmp & 0x1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vmxnor.mm vd, vs2, vs1 # vd = ~(vs2 ^ vs1) */
+void VECTOR_HELPER(vmxnor_mm)(CPURISCVState *env, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, i, vlmax;
+ uint32_t tmp;
+
+ if (vector_vtype_ill(env)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ lmul = vector_get_lmul(env);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ vl = env->vfp.vl;
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ for (i = 0; i < vlmax; i++) {
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ tmp = vector_mask_reg(env, rs1, width, lmul, i) ^
+ vector_mask_reg(env, rs2, width, lmul, i);
+ vector_mask_result(env, rd, width, lmul, i, ~tmp & 0x1);
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vmpopc.m rd, vs2, v0.t # x[rd] = sum_i ( vs2[i].LSB && v0[i].LSB ) */
+void VECTOR_HELPER(vmpopc_m)(CPURISCVState *env, uint32_t vm,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i;
+ if (vector_vtype_ill(env)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (env->vfp.vstart != 0) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+ env->gpr[rd] = 0;
+
+ for (i = 0; i < vlmax; i++) {
+ if (i < vl) {
+ if (vector_mask_reg(env, rs2, width, lmul, i) &&
+ vector_elem_mask(env, vm, width, lmul, i)) {
+ env->gpr[rd]++;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vmfirst.m rd, vs2, vm */
+void VECTOR_HELPER(vmfirst_m)(CPURISCVState *env, uint32_t vm,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i;
+ if (vector_vtype_ill(env)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (env->vfp.vstart != 0) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ /* x[rd] = -1 unless an active set bit is found among the first vl
+  * elements; this also covers vl == vlmax and vl == 0. */
+ env->gpr[rd] = -1;
+
+ for (i = 0; i < vl; i++) {
+ if (vector_mask_reg(env, rs2, width, lmul, i) &&
+ vector_elem_mask(env, vm, width, lmul, i)) {
+ env->gpr[rd] = i;
+ break;
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vmsbf.m vd, vs2, vm # set-before-first mask bit */
+void VECTOR_HELPER(vmsbf_m)(CPURISCVState *env, uint32_t vm,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i;
+ bool first_mask_bit = false;
+ if (vector_vtype_ill(env)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (env->vfp.vstart != 0) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ if (i < vl) {
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (first_mask_bit) {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ continue;
+ }
+ if (!vector_mask_reg(env, rs2, width, lmul, i)) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ first_mask_bit = true;
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vmsif.m vd, vs2, vm # set-including-first mask bit */
+void VECTOR_HELPER(vmsif_m)(CPURISCVState *env, uint32_t vm,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i;
+ bool first_mask_bit = false;
+ if (vector_vtype_ill(env)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (env->vfp.vstart != 0) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ if (i < vl) {
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (first_mask_bit) {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ continue;
+ }
+ if (!vector_mask_reg(env, rs2, width, lmul, i)) {
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ } else {
+ first_mask_bit = true;
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ }
+ }
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vmsof.m vd, vs2, vm # set-only-first mask bit */
+void VECTOR_HELPER(vmsof_m)(CPURISCVState *env, uint32_t vm,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i;
+ bool first_mask_bit = false;
+ if (vector_vtype_ill(env)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (env->vfp.vstart != 0) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ if (i < vl) {
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (first_mask_bit) {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ continue;
+ }
+ if (!vector_mask_reg(env, rs2, width, lmul, i)) {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ } else {
+ first_mask_bit = true;
+ vector_mask_result(env, rd, width, lmul, i, 1);
+ }
+ }
+ } else {
+ vector_mask_result(env, rd, width, lmul, i, 0);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* viota.m v4, v2, v0.t */
+void VECTOR_HELPER(viota_m)(CPURISCVState *env, uint32_t vm, uint32_t rs2,
+ uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest;
+ uint32_t sum = 0;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, lmul, rs2, 1)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (env->vfp.vstart != 0) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u8[j] = sum;
+ if (vector_mask_reg(env, rs2, width, lmul, i)) {
+ sum++;
+ }
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[j] = sum;
+ if (vector_mask_reg(env, rs2, width, lmul, i)) {
+ sum++;
+ }
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[j] = sum;
+ if (vector_mask_reg(env, rs2, width, lmul, i)) {
+ sum++;
+ }
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[j] = sum;
+ if (vector_mask_reg(env, rs2, width, lmul, i)) {
+ sum++;
+ }
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vid.v vd, vm # Write element ID to destination. */
+void VECTOR_HELPER(vid_v)(CPURISCVState *env, uint32_t vm, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_common(lmul, vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rd, false);
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u8[j] = i;
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[j] = i;
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[j] = i;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[j] = i;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
--
2.7.4
^ permalink raw reply related [flat|nested] 43+ messages in thread
* Re: [Qemu-devel] [PATCH v2 16/17] RISC-V: add vector extension mask instructions
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 16/17] RISC-V: add vector extension mask instructions liuzhiwei
@ 2019-09-12 17:07 ` Richard Henderson
0 siblings, 0 replies; 43+ messages in thread
From: Richard Henderson @ 2019-09-12 17:07 UTC (permalink / raw)
To: liuzhiwei, Alistair.Francis, palmer, sagark, kbastian,
riku.voipio, laurent, wenmeng_zhang
Cc: qemu-riscv, qemu-devel, wxy194768
On 9/11/19 2:25 AM, liuzhiwei wrote:
> + for (i = 0; i < vlmax; i++) {
> + if (i < env->vfp.vstart) {
> + continue;
> + } else if (i < vl) {
> + tmp = ~vector_mask_reg(env, rs1, width, lmul, i) &
> + vector_mask_reg(env, rs2, width, lmul, i);
> + vector_mask_result(env, rd, width, lmul, i, tmp);
> + } else {
> + vector_mask_result(env, rd, width, lmul, i, 0);
> + }
> + }
These can be processed in uint64_t units, with a mask based on width:
8: 0xffffffffffffffff
16: 0x5555555555555555
32: 0x1111111111111111
64: 0x0101010101010101
dest = ~in1 & in2 & mask;
with an additional final mask to handle vl not being a multiple of 64.
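A minimal sketch of the word-at-a-time approach described above, for the vmandnot case. It assumes the layout implied by the table: one mask bit per element, spaced width / 8 bits apart and packed into an array of uint64_t words. The names mask_for_width and mandnot_u64 and the words parameter are hypothetical, not taken from the patch, and tail words are zeroed to match the element loop being replaced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: active mask-bit positions for each element width. */
static uint64_t mask_for_width(int width)
{
    switch (width) {
    case 8:  return UINT64_C(0xffffffffffffffff);
    case 16: return UINT64_C(0x5555555555555555);
    case 32: return UINT64_C(0x1111111111111111);
    case 64: return UINT64_C(0x0101010101010101);
    default: return 0; /* unsupported width: no active bits */
    }
}

/*
 * dest = ~in1 & in2 & mask, one uint64_t at a time.
 * Assumes mask bits are spaced width / 8 bits apart, so the first vl
 * elements occupy the low vl * (width / 8) bits of the register.
 */
static void mandnot_u64(uint64_t *dest, const uint64_t *in1,
                        const uint64_t *in2, int width, int vl,
                        size_t words)
{
    uint64_t mask = mask_for_width(width);
    size_t nbits;
    size_t i;

    assert(vl >= 0);
    nbits = (size_t)vl * (size_t)(width / 8);

    for (i = 0; i < words; i++) {
        uint64_t m = mask;
        size_t lo = i * 64; /* index of this word's first bit */

        if (nbits <= lo) {
            m = 0; /* word lies entirely in the tail */
        } else if (nbits - lo < 64) {
            /* final partial word: keep only the bits below vl */
            m &= (UINT64_C(1) << (nbits - lo)) - 1;
        }
        dest[i] = ~in1[i] & in2[i] & m;
    }
}
```

The other mask-logical helpers would differ only in the expression on the last line of the loop, so the per-width switch and the tail handling could be shared.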
Again, I urge you not to bother with impossible vstart -- instructions like
this cannot be interrupted, and the spec allows you to not handle values of
vstart that cannot be produced by the implementation.
r~
^ permalink raw reply [flat|nested] 43+ messages in thread
* [Qemu-devel] [PATCH v2 17/17] RISC-V: add vector extension permutation instructions
2019-09-11 6:25 [Qemu-devel] [PATCH v2 00/17] RISC-V: support vector extension liuzhiwei
` (15 preceding siblings ...)
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 16/17] RISC-V: add vector extension mask instructions liuzhiwei
@ 2019-09-11 6:25 ` liuzhiwei
2019-09-12 17:13 ` Richard Henderson
2019-09-11 7:00 ` [Qemu-devel] [PATCH v2 00/17] RISC-V: support vector extension Aleksandar Markovic
17 siblings, 1 reply; 43+ messages in thread
From: liuzhiwei @ 2019-09-11 6:25 UTC (permalink / raw)
To: Alistair.Francis, palmer, sagark, kbastian, riku.voipio, laurent,
wenmeng_zhang
Cc: qemu-riscv, qemu-devel, wxy194768, LIU Zhiwei
From: LIU Zhiwei <zhiwei_liu@c-sky.com>
Signed-off-by: LIU Zhiwei <zhiwei_liu@c-sky.com>
---
target/riscv/helper.h | 15 +
target/riscv/insn32.decode | 16 +
target/riscv/insn_trans/trans_rvv.inc.c | 15 +
target/riscv/vector_helper.c | 1068 +++++++++++++++++++++++++++++++
4 files changed, 1114 insertions(+)
diff --git a/target/riscv/helper.h b/target/riscv/helper.h
index 337ac2e..2d153ce 100644
--- a/target/riscv/helper.h
+++ b/target/riscv/helper.h
@@ -417,5 +417,20 @@ DEF_HELPER_3(vector_vid_v, void, env, i32, i32)
DEF_HELPER_4(vector_vmpopc_m, void, env, i32, i32, i32)
DEF_HELPER_4(vector_vmfirst_m, void, env, i32, i32, i32)
+DEF_HELPER_4(vector_vext_x_v, void, env, i32, i32, i32)
+DEF_HELPER_4(vector_vmv_s_x, void, env, i32, i32, i32)
+DEF_HELPER_4(vector_vfmv_f_s, void, env, i32, i32, i32)
+DEF_HELPER_4(vector_vfmv_s_f, void, env, i32, i32, i32)
+DEF_HELPER_5(vector_vslideup_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vslideup_vi, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vslide1up_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vslidedown_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vslidedown_vi, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vslide1down_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vrgather_vv, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vrgather_vx, void, env, i32, i32, i32, i32)
+DEF_HELPER_5(vector_vrgather_vi, void, env, i32, i32, i32, i32)
+DEF_HELPER_4(vector_vcompress_vm, void, env, i32, i32, i32)
+
DEF_HELPER_4(vector_vsetvli, void, env, i32, i32, i32)
DEF_HELPER_4(vector_vsetvl, void, env, i32, i32, i32)
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index 1de776b..c98915b 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -558,5 +558,21 @@ vmsif_m 010110 . ..... 00011 010 ..... 1010111 @r2_vm
viota_m 010110 . ..... 10000 010 ..... 1010111 @r2_vm
vid_v 010110 . 00000 10001 010 ..... 1010111 @r1_vm
+vext_x_v 001100 1 ..... ..... 010 ..... 1010111 @r
+vmv_s_x 001101 1 ..... ..... 110 ..... 1010111 @r
+vfmv_f_s 001100 1 ..... ..... 001 ..... 1010111 @r
+vfmv_s_f 001101 1 ..... ..... 101 ..... 1010111 @r
+vslideup_vx 001110 . ..... ..... 100 ..... 1010111 @r_vm
+vslideup_vi 001110 . ..... ..... 011 ..... 1010111 @r_vm
+vslide1up_vx 001110 . ..... ..... 110 ..... 1010111 @r_vm
+vslidedown_vx 001111 . ..... ..... 100 ..... 1010111 @r_vm
+vslidedown_vi 001111 . ..... ..... 011 ..... 1010111 @r_vm
+vslide1down_vx 001111 . ..... ..... 110 ..... 1010111 @r_vm
+vrgather_vv 001100 . ..... ..... 000 ..... 1010111 @r_vm
+vrgather_vx 001100 . ..... ..... 100 ..... 1010111 @r_vm
+vrgather_vi 001100 . ..... ..... 011 ..... 1010111 @r_vm
+vcompress_vm 010111 - ..... ..... 010 ..... 1010111 @r
+
+
vsetvli 0 ........... ..... 111 ..... 1010111 @r2_zimm
vsetvl 1000000 ..... ..... 111 ..... 1010111 @r
diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c
index 85e435a..1774d1f 100644
--- a/target/riscv/insn_trans/trans_rvv.inc.c
+++ b/target/riscv/insn_trans/trans_rvv.inc.c
@@ -471,5 +471,20 @@ GEN_VECTOR_R2_VM(vmsif_m)
GEN_VECTOR_R2_VM(viota_m)
GEN_VECTOR_R1_VM(vid_v)
+GEN_VECTOR_R(vmv_s_x)
+GEN_VECTOR_R(vfmv_f_s)
+GEN_VECTOR_R(vfmv_s_f)
+GEN_VECTOR_R(vext_x_v)
+GEN_VECTOR_R_VM(vslideup_vx)
+GEN_VECTOR_R_VM(vslideup_vi)
+GEN_VECTOR_R_VM(vslide1up_vx)
+GEN_VECTOR_R_VM(vslidedown_vx)
+GEN_VECTOR_R_VM(vslidedown_vi)
+GEN_VECTOR_R_VM(vslide1down_vx)
+GEN_VECTOR_R_VM(vrgather_vv)
+GEN_VECTOR_R_VM(vrgather_vx)
+GEN_VECTOR_R_VM(vrgather_vi)
+GEN_VECTOR_R(vcompress_vm)
+
GEN_VECTOR_R2_ZIMM(vsetvli)
GEN_VECTOR_R(vsetvl)
diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c
index 9e15df9..0a25996 100644
--- a/target/riscv/vector_helper.c
+++ b/target/riscv/vector_helper.c
@@ -1010,6 +1010,26 @@ static inline bool vector_overlap_dstgp_srcgp(int rd, int dlen, int rs,
return false;
}
+/* fetch unsigned element by width */
+static inline uint64_t vector_get_iu_elem(CPURISCVState *env, uint32_t width,
+ uint32_t rs2, uint32_t index)
+{
+ uint64_t elem;
+ if (width == 8) {
+ elem = env->vfp.vreg[rs2].u8[index];
+ } else if (width == 16) {
+ elem = env->vfp.vreg[rs2].u16[index];
+ } else if (width == 32) {
+ elem = env->vfp.vreg[rs2].u32[index];
+ } else if (width == 64) {
+ elem = env->vfp.vreg[rs2].u64[index];
+ } else { /* the max of (XLEN, FLEN) is no bigger than 64 */
+ helper_raise_exception(env, RISCV_EXCP_ILLEGAL_INST);
+ return 0;
+ }
+ return elem;
+}
+
static inline void vector_get_layout(CPURISCVState *env, int width, int lmul,
int index, int *idx, int *pos)
{
@@ -24631,3 +24651,1051 @@ void VECTOR_HELPER(vid_v)(CPURISCVState *env, uint32_t vm, uint32_t rd)
env->vfp.vstart = 0;
return;
}
+
+/* vfmv.f.s rd, vs2 # rd = vs2[0] (rs1=0) */
+void VECTOR_HELPER(vfmv_f_s)(CPURISCVState *env, uint32_t rs1, uint32_t rs2,
+ uint32_t rd)
+{
+ int width, flen;
+ uint64_t mask;
+
+ if (vector_vtype_ill(env)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ if (env->misa & RVD) {
+ flen = 8;
+ } else if (env->misa & RVF) {
+ flen = 4;
+ } else {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ width = vector_get_width(env);
+ mask = width < 64 ? (~((uint64_t)0)) << width : 0;
+
+ if (width == 8) {
+ env->fpr[rd] = (uint64_t)env->vfp.vreg[rs2].s8[0] | mask;
+ } else if (width == 16) {
+ env->fpr[rd] = (uint64_t)env->vfp.vreg[rs2].s16[0] | mask;
+ } else if (width == 32) {
+ env->fpr[rd] = (uint64_t)env->vfp.vreg[rs2].s32[0] | mask;
+ } else if (width == 64) {
+ if (flen == 4) {
+ env->fpr[rd] = env->vfp.vreg[rs2].s64[0] & 0xffffffff;
+ } else {
+ env->fpr[rd] = env->vfp.vreg[rs2].s64[0];
+ }
+ } else {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vmv.s.x vd, rs1 # vd[0] = rs1 */
+void VECTOR_HELPER(vmv_s_x)(CPURISCVState *env, uint32_t rs1, uint32_t rs2,
+ uint32_t rd)
+{
+ int width;
+ if (vector_vtype_ill(env)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ if (env->vfp.vstart >= env->vfp.vl) {
+ return;
+ }
+
+ memset(&env->vfp.vreg[rd].u8[0], 0, VLEN / 8);
+ width = vector_get_width(env);
+
+ if (width == 8) {
+ env->vfp.vreg[rd].u8[0] = env->gpr[rs1];
+ } else if (width == 16) {
+ env->vfp.vreg[rd].u16[0] = env->gpr[rs1];
+ } else if (width == 32) {
+ env->vfp.vreg[rd].u32[0] = env->gpr[rs1];
+ } else if (width == 64) {
+ env->vfp.vreg[rd].u64[0] = env->gpr[rs1];
+ } else {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vfmv.s.f vd, rs1 # vd[0] = rs1 (vs2 = 0) */
+void VECTOR_HELPER(vfmv_s_f)(CPURISCVState *env, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, flen;
+
+ if (vector_vtype_ill(env)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ if (env->vfp.vstart >= env->vfp.vl) {
+ return;
+ }
+ if (env->misa & RVD) {
+ flen = 8;
+ } else if (env->misa & RVF) {
+ flen = 4;
+ } else {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ width = vector_get_width(env);
+
+ if (width == 8) {
+ env->vfp.vreg[rd].u8[0] = env->fpr[rs1];
+ } else if (width == 16) {
+ env->vfp.vreg[rd].u16[0] = env->fpr[rs1];
+ } else if (width == 32) {
+ env->vfp.vreg[rd].u32[0] = env->fpr[rs1];
+ } else if (width == 64) {
+ if (flen == 4) { /* 1-extended to FLEN bits */
+ env->vfp.vreg[rd].u64[0] = (uint64_t)env->fpr[rs1]
+ | 0xffffffff00000000;
+ } else {
+ env->vfp.vreg[rd].u64[0] = env->fpr[rs1];
+ }
+ } else {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vslideup.vx vd, vs2, rs1, vm # vd[i+rs1] = vs2[i] */
+void VECTOR_HELPER(vslideup_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax, offset;
+ int i, j, dest, src, k;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, lmul, rs2, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+ offset = env->gpr[rs1];
+
+ if (offset < env->vfp.vstart) {
+ offset = env->vfp.vstart;
+ }
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src = rs2 + ((i - offset) / (VLEN / width));
+ j = i % (VLEN / width);
+ k = (i - offset) % (VLEN / width);
+ if (i < offset) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u8[j] =
+ env->vfp.vreg[src].u8[k];
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[j] =
+ env->vfp.vreg[src].u16[k];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[j] =
+ env->vfp.vreg[src].u32[k];
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[j] =
+ env->vfp.vreg[src].u64[k];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vslideup.vi vd, vs2, imm, vm # vd[i+imm] = vs2[i] */
+void VECTOR_HELPER(vslideup_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax, offset;
+ int i, j, dest, src, k;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, lmul, rs2, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+ offset = rs1;
+
+ if (offset < env->vfp.vstart) {
+ offset = env->vfp.vstart;
+ }
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src = rs2 + ((i - offset) / (VLEN / width));
+ j = i % (VLEN / width);
+ k = (i - offset) % (VLEN / width);
+ if (i < offset) {
+ continue;
+ } else if (i < vl) {
+ if (width == 8) {
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u8[j] =
+ env->vfp.vreg[src].u8[k];
+ }
+ } else if (width == 16) {
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[j] =
+ env->vfp.vreg[src].u16[k];
+ }
+ } else if (width == 32) {
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[j] =
+ env->vfp.vreg[src].u32[k];
+ }
+ } else if (width == 64) {
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[j] =
+ env->vfp.vreg[src].u64[k];
+ }
+ } else {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vslide1up.vx vd, vs2, rs1, vm # vd[0]=x[rs1], vd[i+1] = vs2[i] */
+void VECTOR_HELPER(vslide1up_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src, k;
+ uint64_t s1;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, lmul, rs2, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+ s1 = env->gpr[rs1];
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src = rs2 + ((i - 1) / (VLEN / width));
+ j = i % (VLEN / width);
+ k = (i - 1) % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i == 0 && env->vfp.vstart == 0) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u8[j] = s1;
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[j] = s1;
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[j] = s1;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[j] = s1;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src].u8[k];
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[j] =
+ env->vfp.vreg[src].u16[k];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[j] =
+ env->vfp.vreg[src].u32[k];
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[j] =
+ env->vfp.vreg[src].u64[k];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vslidedown.vx vd, vs2, rs1, vm # vd[i] = vs2[i + rs1] */
+void VECTOR_HELPER(vslidedown_vx)(CPURISCVState *env, uint32_t vm,
+ uint32_t rs1, uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax, offset;
+ int i, j, dest, src, k;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_force(vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+ offset = env->gpr[rs1];
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src = rs2 + ((i + offset) / (VLEN / width));
+ j = i % (VLEN / width);
+ k = (i + offset) % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (i + offset < vlmax) {
+ env->vfp.vreg[dest].u8[j] =
+ env->vfp.vreg[src].u8[k];
+ } else {
+ env->vfp.vreg[dest].u8[j] = 0;
+ }
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (i + offset < vlmax) {
+ env->vfp.vreg[dest].u16[j] =
+ env->vfp.vreg[src].u16[k];
+ } else {
+ env->vfp.vreg[dest].u16[j] = 0;
+ }
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (i + offset < vlmax) {
+ env->vfp.vreg[dest].u32[j] =
+ env->vfp.vreg[src].u32[k];
+ } else {
+ env->vfp.vreg[dest].u32[j] = 0;
+ }
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (i + offset < vlmax) {
+ env->vfp.vreg[dest].u64[j] =
+ env->vfp.vreg[src].u64[k];
+ } else {
+ env->vfp.vreg[dest].u64[j] = 0;
+ }
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+void VECTOR_HELPER(vslidedown_vi)(CPURISCVState *env, uint32_t vm,
+ uint32_t rs1, uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax, offset;
+ int i, j, dest, src, k;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_force(vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+ offset = rs1;
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src = rs2 + ((i + offset) / (VLEN / width));
+ j = i % (VLEN / width);
+ k = (i + offset) % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (i + offset < vlmax) {
+ env->vfp.vreg[dest].u8[j] =
+ env->vfp.vreg[src].u8[k];
+ } else {
+ env->vfp.vreg[dest].u8[j] = 0;
+ }
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (i + offset < vlmax) {
+ env->vfp.vreg[dest].u16[j] =
+ env->vfp.vreg[src].u16[k];
+ } else {
+ env->vfp.vreg[dest].u16[j] = 0;
+ }
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (i + offset < vlmax) {
+ env->vfp.vreg[dest].u32[j] =
+ env->vfp.vreg[src].u32[k];
+ } else {
+ env->vfp.vreg[dest].u32[j] = 0;
+ }
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (i + offset < vlmax) {
+ env->vfp.vreg[dest].u64[j] =
+ env->vfp.vreg[src].u64[k];
+ } else {
+ env->vfp.vreg[dest].u64[j] = 0;
+ }
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vslide1down.vx vd, vs2, rs1, vm # vd[vl - 1]=x[rs1], vd[i] = vs2[i + 1] */
+void VECTOR_HELPER(vslide1down_vx)(CPURISCVState *env, uint32_t vm,
+ uint32_t rs1, uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src, k;
+ uint64_t s1;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env) || vector_overlap_vm_force(vm, rd)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+ s1 = env->gpr[rs1];
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src = rs2 + ((i + 1) / (VLEN / width));
+ j = i % (VLEN / width);
+ k = (i + 1) % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i == vl - 1) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u8[j] = s1;
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[j] = s1;
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[j] = s1;
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[j] = s1;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else if (i < vl - 1) {
+ switch (width) {
+ case 8:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u8[j] = env->vfp.vreg[src].u8[k];
+ }
+ break;
+ case 16:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[j] =
+ env->vfp.vreg[src].u16[k];
+ }
+ break;
+ case 32:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[j] =
+ env->vfp.vreg[src].u32[k];
+ }
+ break;
+ case 64:
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[j] =
+ env->vfp.vreg[src].u64[k];
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/*
+ * vcompress.vm vd, vs2, vs1
+ * Compress into vd elements of vs2 where vs1 is enabled
+ */
+void VECTOR_HELPER(vcompress_vm)(CPURISCVState *env, uint32_t rs1, uint32_t rs2,
+ uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src;
+ uint32_t vd_idx, num = 0;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+ if (vector_vtype_ill(env)
+ || vector_overlap_dstgp_srcgp(rd, lmul, rs1, 1)
+ || vector_overlap_dstgp_srcgp(rd, lmul, rs2, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ if (env->vfp.vstart != 0) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ /* zero every element of the destination register group */
+ for (i = 0; i < lmul; i++) {
+ memset(&env->vfp.vreg[rd + i].u64[0], 0, VLEN / 8);
+ }
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (num / (VLEN / width));
+ src = rs2 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ vd_idx = num % (VLEN / width);
+ if (i < vl) {
+ switch (width) {
+ case 8:
+ if (vector_mask_reg(env, rs1, width, lmul, i)) {
+ env->vfp.vreg[dest].u8[vd_idx] =
+ env->vfp.vreg[src].u8[j];
+ num++;
+ }
+ break;
+ case 16:
+ if (vector_mask_reg(env, rs1, width, lmul, i)) {
+ env->vfp.vreg[dest].u16[vd_idx] =
+ env->vfp.vreg[src].u16[j];
+ num++;
+ }
+ break;
+ case 32:
+ if (vector_mask_reg(env, rs1, width, lmul, i)) {
+ env->vfp.vreg[dest].u32[vd_idx] =
+ env->vfp.vreg[src].u32[j];
+ num++;
+ }
+ break;
+ case 64:
+ if (vector_mask_reg(env, rs1, width, lmul, i)) {
+ env->vfp.vreg[dest].u64[vd_idx] =
+ env->vfp.vreg[src].u64[j];
+ num++;
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vext.x.v rd, vs2, rs1 # rd = (x[rs1] >= VLEN / SEW) ? 0 : vs2[x[rs1]] */
+void VECTOR_HELPER(vext_x_v)(CPURISCVState *env, uint32_t rs1, uint32_t rs2,
+ uint32_t rd)
+{
+ int width;
+ uint64_t elem;
+ target_ulong index = env->gpr[rs1];
+
+ if (vector_vtype_ill(env)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ width = vector_get_width(env);
+
+ if (index >= VLEN / width) { /* index is out of range */
+ env->gpr[rd] = 0;
+ } else {
+ /* read the element only after the bounds check */
+ elem = vector_get_iu_elem(env, width, rs2, index);
+ env->gpr[rd] = elem;
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/*
+ * vrgather.vv vd, vs2, vs1, vm #
+ * vd[i] = (vs1[i] >= VLMAX) ? 0 : vs2[vs1[i]];
+ */
+void VECTOR_HELPER(vrgather_vv)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src, src1;
+ uint64_t index; /* wide enough for a 64-bit index element */
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, lmul, rs1, lmul)
+ || vector_overlap_dstgp_srcgp(rd, lmul, rs2, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ vector_lmul_check_reg(env, lmul, rs1, false);
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ src1 = rs1 + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ index = env->vfp.vreg[src1].u8[j];
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (index >= vlmax) {
+ env->vfp.vreg[dest].u8[j] = 0;
+ } else {
+ src = rs2 + (index / (VLEN / width));
+ index = index % (VLEN / width);
+ env->vfp.vreg[dest].u8[j] =
+ env->vfp.vreg[src].u8[index];
+ }
+ }
+ break;
+ case 16:
+ index = env->vfp.vreg[src1].u16[j];
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (index >= vlmax) {
+ env->vfp.vreg[dest].u16[j] = 0;
+ } else {
+ src = rs2 + (index / (VLEN / width));
+ index = index % (VLEN / width);
+ env->vfp.vreg[dest].u16[j] =
+ env->vfp.vreg[src].u16[index];
+ }
+ }
+ break;
+ case 32:
+ index = env->vfp.vreg[src1].u32[j];
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (index >= vlmax) {
+ env->vfp.vreg[dest].u32[j] = 0;
+ } else {
+ src = rs2 + (index / (VLEN / width));
+ index = index % (VLEN / width);
+ env->vfp.vreg[dest].u32[j] =
+ env->vfp.vreg[src].u32[index];
+ }
+ }
+ break;
+ case 64:
+ index = env->vfp.vreg[src1].u64[j];
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (index >= vlmax) {
+ env->vfp.vreg[dest].u64[j] = 0;
+ } else {
+ src = rs2 + (index / (VLEN / width));
+ index = index % (VLEN / width);
+ env->vfp.vreg[dest].u64[j] =
+ env->vfp.vreg[src].u64[index];
+ }
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/*
+ * vrgather.vx vd, vs2, rs1, vm #
+ * vd[i] = (x[rs1] >= VLMAX) ? 0 : vs2[x[rs1]];
+ */
+void VECTOR_HELPER(vrgather_vx)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src;
+ target_ulong index; /* same width as env->gpr[], avoids truncation */
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, lmul, rs2, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ index = env->gpr[rs1];
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (index >= vlmax) {
+ env->vfp.vreg[dest].u8[j] = 0;
+ } else {
+ src = rs2 + (index / (VLEN / width));
+ index = index % (VLEN / width);
+ env->vfp.vreg[dest].u8[j] =
+ env->vfp.vreg[src].u8[index];
+ }
+ }
+ break;
+ case 16:
+ index = env->gpr[rs1];
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (index >= vlmax) {
+ env->vfp.vreg[dest].u16[j] = 0;
+ } else {
+ src = rs2 + (index / (VLEN / width));
+ index = index % (VLEN / width);
+ env->vfp.vreg[dest].u16[j] =
+ env->vfp.vreg[src].u16[index];
+ }
+ }
+ break;
+ case 32:
+ index = env->gpr[rs1];
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (index >= vlmax) {
+ env->vfp.vreg[dest].u32[j] = 0;
+ } else {
+ src = rs2 + (index / (VLEN / width));
+ index = index % (VLEN / width);
+ env->vfp.vreg[dest].u32[j] =
+ env->vfp.vreg[src].u32[index];
+ }
+ }
+ break;
+ case 64:
+ index = env->gpr[rs1];
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (index >= vlmax) {
+ env->vfp.vreg[dest].u64[j] = 0;
+ } else {
+ src = rs2 + (index / (VLEN / width));
+ index = index % (VLEN / width);
+ env->vfp.vreg[dest].u64[j] =
+ env->vfp.vreg[src].u64[index];
+ }
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
+/* vrgather.vi vd, vs2, imm, vm # vd[i] = (imm >= VLMAX) ? 0 : vs2[imm] */
+void VECTOR_HELPER(vrgather_vi)(CPURISCVState *env, uint32_t vm, uint32_t rs1,
+ uint32_t rs2, uint32_t rd)
+{
+ int width, lmul, vl, vlmax;
+ int i, j, dest, src;
+ uint32_t index;
+
+ lmul = vector_get_lmul(env);
+ vl = env->vfp.vl;
+
+ if (vector_vtype_ill(env)
+ || vector_overlap_vm_force(vm, rd)
+ || vector_overlap_dstgp_srcgp(rd, lmul, rs2, lmul)) {
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ vector_lmul_check_reg(env, lmul, rs2, false);
+ vector_lmul_check_reg(env, lmul, rd, false);
+
+ if (env->vfp.vstart >= vl) {
+ return;
+ }
+
+ width = vector_get_width(env);
+ vlmax = vector_get_vlmax(env);
+
+ for (i = 0; i < vlmax; i++) {
+ dest = rd + (i / (VLEN / width));
+ j = i % (VLEN / width);
+ if (i < env->vfp.vstart) {
+ continue;
+ } else if (i < vl) {
+ switch (width) {
+ case 8:
+ index = rs1;
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (index >= vlmax) {
+ env->vfp.vreg[dest].u8[j] = 0;
+ } else {
+ src = rs2 + (index / (VLEN / width));
+ index = index % (VLEN / width);
+ env->vfp.vreg[dest].u8[j] =
+ env->vfp.vreg[src].u8[index];
+ }
+ }
+ break;
+ case 16:
+ index = rs1;
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (index >= vlmax) {
+ env->vfp.vreg[dest].u16[j] = 0;
+ } else {
+ src = rs2 + (index / (VLEN / width));
+ index = index % (VLEN / width);
+ env->vfp.vreg[dest].u16[j] =
+ env->vfp.vreg[src].u16[index];
+ }
+ }
+ break;
+ case 32:
+ index = rs1;
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (index >= vlmax) {
+ env->vfp.vreg[dest].u32[j] = 0;
+ } else {
+ src = rs2 + (index / (VLEN / width));
+ index = index % (VLEN / width);
+ env->vfp.vreg[dest].u32[j] =
+ env->vfp.vreg[src].u32[index];
+ }
+ }
+ break;
+ case 64:
+ index = rs1;
+ if (vector_elem_mask(env, vm, width, lmul, i)) {
+ if (index >= vlmax) {
+ env->vfp.vreg[dest].u64[j] = 0;
+ } else {
+ src = rs2 + (index / (VLEN / width));
+ index = index % (VLEN / width);
+ env->vfp.vreg[dest].u64[j] =
+ env->vfp.vreg[src].u64[index];
+ }
+ }
+ break;
+ default:
+ riscv_raise_exception(env, RISCV_EXCP_ILLEGAL_INST, GETPC());
+ return;
+ }
+ } else {
+ vector_tail_common(env, dest, j, width);
+ }
+ }
+ env->vfp.vstart = 0;
+ return;
+}
+
--
2.7.4
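
For readers following the helpers above, here is a minimal reference sketch of
the element-level semantics of vslide1down.vx, vcompress.vm and vrgather.vv on
flat arrays. It is not code from this patch: the model_* names, the fixed
32-bit element type and the one-byte-per-element mask are assumptions made for
illustration, and masking by v0, vstart and register-group indexing are left
out. The gather model compares the full 64-bit index against vlmax before
using it, and asserts that source and destination do not alias, in the spirit
of the overlap check in the helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flat-array models; none of these names exist in the patch. */

/* vslide1down.vx: vd[i] = vs2[i + 1] for i < vl - 1, vd[vl - 1] = x. */
void model_slide1down(uint32_t *vd, const uint32_t *vs2, uint32_t x,
                      size_t vl)
{
    for (size_t i = 0; i + 1 < vl; i++) {
        vd[i] = vs2[i + 1];
    }
    if (vl > 0) {
        vd[vl - 1] = x;
    }
}

/*
 * vcompress.vm: pack the vs2 elements whose mask entry is non-zero into
 * the low elements of vd. As in the helper, the destination is zeroed
 * first. Returns the number of elements packed.
 */
size_t model_compress(uint32_t *vd, const uint32_t *vs2,
                      const uint8_t *mask, size_t vl, size_t vlmax)
{
    size_t num = 0;

    memset(vd, 0, vlmax * sizeof(vd[0]));
    for (size_t i = 0; i < vl; i++) {
        if (mask[i]) {
            vd[num++] = vs2[i];
        }
    }
    return num;
}

/* vrgather.vv: vd[i] = (idx[i] >= vlmax) ? 0 : vs2[idx[i]]. */
void model_rgather(uint32_t *vd, const uint32_t *vs2, const uint64_t *idx,
                   size_t vl, size_t vlmax)
{
    assert(vd != vs2); /* source and destination must not alias */
    for (size_t i = 0; i < vl; i++) {
        /* Compare the full-width index first; only then use it. */
        vd[i] = (idx[i] >= vlmax) ? 0 : vs2[idx[i]];
    }
}
```

With vs2 = {10, 20, 30, 40} and vl = vlmax = 4, sliding down by one with
x = 99 gives {20, 30, 40, 99}, compressing under mask {1, 0, 1, 0} gives
{10, 30, 0, 0}, and gathering with indices {3, 0, 4, 0x100000000} gives
{40, 10, 0, 0}. The last index shows why the comparison has to happen before
any narrowing to 32 bits.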
* Re: [Qemu-devel] [PATCH v2 17/17] RISC-V: add vector extension premutation instructions
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 17/17] RISC-V: add vector extension premutation instructions liuzhiwei
@ 2019-09-12 17:13 ` Richard Henderson
0 siblings, 0 replies; 43+ messages in thread
From: Richard Henderson @ 2019-09-12 17:13 UTC (permalink / raw)
To: liuzhiwei, Alistair.Francis, palmer, sagark, kbastian,
riku.voipio, laurent, wenmeng_zhang
Cc: qemu-riscv, qemu-devel, wxy194768
On 9/11/19 2:25 AM, liuzhiwei wrote:
> +/* vfmv.f.s rd, vs2 # rd = vs2[0] (rs1=0) */
> +void VECTOR_HELPER(vfmv_f_s)(CPURISCVState *env, uint32_t rs1, uint32_t rs2,
> + uint32_t rd)
...
> +/* vmv.s.x vd, rs1 # vd[0] = rs1 */
> +void VECTOR_HELPER(vmv_s_x)(CPURISCVState *env, uint32_t rs1, uint32_t rs2,
> + uint32_t rd)
...
> +/* vfmv.s.f vd, rs1 # vd[0] = rs1 (vs2 = 0) */
> +void VECTOR_HELPER(vfmv_s_f)(CPURISCVState *env, uint32_t rs1,
> + uint32_t rs2, uint32_t rd)
I'll note that, with the vector parameters known to the translator, as I have
advocated, these operations are trivially expanded inline as one or two tcg
operations.
r~
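
The point about translation-time knowledge can be illustrated with a small
standalone model. This is only a sketch under stated assumptions: it does not
use the TCG API, and vreg_t, store_elem0_fn, translate_vmv_s_x and
execute_vmv_s_x are hypothetical names, not part of QEMU or of this series.
It shows the SEW switch being resolved once, when the instruction is
translated, so that the run-time path for vmv.s.x is a single store to
element 0. Handling of the remaining elements is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VLEN 128 /* fixed at 128 bits, as in this series */

/* Hypothetical stand-in for one vector register. */
typedef union {
    uint64_t u64[VLEN / 64];
    uint32_t u32[VLEN / 32];
    uint16_t u16[VLEN / 16];
    uint8_t u8[VLEN / 8];
} vreg_t;

typedef void (*store_elem0_fn)(vreg_t *vd, uint64_t x);

static void store8(vreg_t *vd, uint64_t x)  { vd->u8[0] = (uint8_t)x; }
static void store16(vreg_t *vd, uint64_t x) { vd->u16[0] = (uint16_t)x; }
static void store32(vreg_t *vd, uint64_t x) { vd->u32[0] = (uint32_t)x; }
static void store64(vreg_t *vd, uint64_t x) { vd->u64[0] = x; }

/*
 * "Translation time": SEW is known, so the width is resolved once.
 * NULL stands for "raise an illegal instruction exception".
 */
store_elem0_fn translate_vmv_s_x(int sew)
{
    switch (sew) {
    case 8:
        return store8;
    case 16:
        return store16;
    case 32:
        return store32;
    case 64:
        return store64;
    default:
        return NULL;
    }
}

/* "Execution time": no width switch is left, only the resolved store. */
void execute_vmv_s_x(store_elem0_fn op, vreg_t *vd, uint64_t x)
{
    assert(op != NULL); /* an illegal SEW was rejected at translation */
    op(vd, x);
}
```

In a real front end the resolved operation would be emitted as a TCG store
rather than kept as a function pointer; the function pointer is used here
only to keep the example self-contained.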
* Re: [Qemu-devel] [PATCH v2 00/17] RISC-V: support vector extension
2019-09-11 6:25 [Qemu-devel] [PATCH v2 00/17] RISC-V: support vector extension liuzhiwei
` (16 preceding siblings ...)
2019-09-11 6:25 ` [Qemu-devel] [PATCH v2 17/17] RISC-V: add vector extension premutation instructions liuzhiwei
@ 2019-09-11 7:00 ` Aleksandar Markovic
2019-09-14 12:59 ` Palmer Dabbelt
17 siblings, 1 reply; 43+ messages in thread
From: Aleksandar Markovic @ 2019-09-11 7:00 UTC (permalink / raw)
To: liuzhiwei
Cc: riku.voipio, qemu-riscv, sagark, kbastian, palmer, qemu-devel,
wxy194768, laurent, wenmeng_zhang, Alistair.Francis
11.09.2019. 08.35, "liuzhiwei" <zhiwei_liu@c-sky.com> wrote:
>
> Features:
> * support specification riscv-v-spec-0.7.1
>   (https://content.riscv.org/wp-content/uploads/2019/06/17.40-Vector_RISCV-20190611-Vectors.pdf).
Hi, Zhiwei.
The linked document is a presentation outlining the general concepts of the
instruction set in question. That is certainly useful and nice to have, but
for the review process we need the *specification* itself (especially given
that it is still in the draft phase, and therefore a "moving target"). Please
provide such a link.
I also noticed the lack of commit messages, and was really disappointed by
that. It looks to me as if you did not fully follow our guidelines for
submitting patches.
Yours,
Aleksandar
> * support basic vector extension.
> * support Zvlsseg.
> * support Zvamo.
> * not support Zvediv as it is changing.
> * fixed VLEN 128bit.
> * fixed SLEN 128bit.
> * ELEN support 8bit, 16bit, 32bit, 64bit.
>
> Todo:
> * support VLEN configure from qemu command line.
> * move check code from execution-time to translation-time
>
> Changelog:
> V2
> * use float16_compare{_quiet}
> * only use GETPC() in outer most helper
> * add ctx.ext_v Property
>
>
> LIU Zhiwei (17):
> RISC-V: add vfp field in CPURISCVState
> RISC-V: turn on vector extension from command line by cfg.ext_v
> Property
> RISC-V: support vector extension csr
> RISC-V: add vector extension configure instruction
> RISC-V: add vector extension load and store instructions
> RISC-V: add vector extension fault-only-first implementation
> RISC-V: add vector extension atomic instructions
> RISC-V: add vector extension integer instructions part1,
> add/sub/adc/sbc
> RISC-V: add vector extension integer instructions part2, bit/shift
> RISC-V: add vector extension integer instructions part3, cmp/min/max
> RISC-V: add vector extension integer instructions part4, mul/div/merge
> RISC-V: add vector extension fixed point instructions
> RISC-V: add vector extension float instruction part1, add/sub/mul/div
> RISC-V: add vector extension float instructions part2,
> sqrt/cmp/cvt/others
> RISC-V: add vector extension reduction instructions
> RISC-V: add vector extension mask instructions
> RISC-V: add vector extension premutation instructions
>
> linux-user/riscv/cpu_loop.c | 7 +
> target/riscv/Makefile.objs | 2 +-
> target/riscv/cpu.c | 6 +-
> target/riscv/cpu.h | 30 +
> target/riscv/cpu_bits.h | 15 +
> target/riscv/cpu_helper.c | 7 +
> target/riscv/csr.c | 65 +-
> target/riscv/helper.h | 358 +
> target/riscv/insn32.decode | 373 +
> target/riscv/insn_trans/trans_rvv.inc.c | 490 +
> target/riscv/translate.c | 1 +
> target/riscv/vector_helper.c | 25701 ++++++++++++++++++++++++++++++
> 12 files changed, 27049 insertions(+), 6 deletions(-)
> create mode 100644 target/riscv/insn_trans/trans_rvv.inc.c
> create mode 100644 target/riscv/vector_helper.c
>
> --
> 2.7.4
>
>
* Re: [Qemu-devel] [PATCH v2 00/17] RISC-V: support vector extension
2019-09-11 7:00 ` [Qemu-devel] [PATCH v2 00/17] RISC-V: support vector extension Aleksandar Markovic
@ 2019-09-14 12:59 ` Palmer Dabbelt
0 siblings, 0 replies; 43+ messages in thread
From: Palmer Dabbelt @ 2019-09-14 12:59 UTC (permalink / raw)
To: aleksandar.m.mail
Cc: qemu-riscv, sagark, Bastian Koppelmann, riku.voipio, qemu-devel,
wxy194768, laurent, wenmeng_zhang, Alistair Francis, zhiwei_liu
On Wed, 11 Sep 2019 00:00:56 PDT (-0700), aleksandar.m.mail@gmail.com wrote:
> 11.09.2019. 08.35, "liuzhiwei" <zhiwei_liu@c-sky.com> wrote:
>>
>> Features:
>> * support specification riscv-v-spec-0.7.1
>>   (https://content.riscv.org/wp-content/uploads/2019/06/17.40-Vector_RISCV-20190611-Vectors.pdf).
>
> Hi, Zhiwei.
>
> The linked document is a presentation outlining the general concepts of the
> instruction set in question. That is certainly useful and nice to have, but
> for the review process we need the *specification* itself (especially given
> that it is still in the draft phase, and therefore a "moving target"). Please
> provide such a link.
Here's the V spec repository
https://github.com/riscv/riscv-v-spec
and the exact 0.7.1 specification PDF
https://github.com/riscv/riscv-v-spec/releases/download/0.7.1/riscv-v-spec-0.7.1.pdf
In RISC-V land this constitutes an official draft -- there's a whole process
for getting a specification ratified, but that isn't done for these draft
specifications. The RISC-V QEMU maintainers agree that we'll take
implementations of drafts as long as there's a concrete definition we can point
at, like this one.
> I also noticed the lack of commit messages, and was really disappointed by
> that. It looks to me as if you did not fully follow our guidelines for
> submitting patches.
>
> Yours,
> Aleksandar