* [PATCH v3 0/6] target/arm: assorted mte fixes
@ 2024-02-07 2:52 Richard Henderson
2024-02-07 2:52 ` [PATCH v3 1/6] linux-user/aarch64: Choose SYNC as the preferred MTE mode Richard Henderson
` (7 more replies)
0 siblings, 8 replies; 14+ messages in thread
From: Richard Henderson @ 2024-02-07 2:52 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-arm, gustavo.romero
Changes for v3:
- As if /sys/devices/system/cpu/cpu<N>/mte_tcf_preferred is "sync".
- Fix do_st_zpa as well as do_ld_zpa. Oops.
Because of the above, I dropped Gustavo's t-b.
r~
Richard Henderson (6):
linux-user/aarch64: Choose SYNC as the preferred MTE mode
target/arm: Fix nregs computation in do_{ld,st}_zpa
target/arm: Adjust and validate mtedesc sizem1
target/arm: Split out make_svemte_desc
target/arm: Handle mte in do_ldrq, do_ldro
target/arm: Fix SVE/SME gross MTE suppression checks
linux-user/aarch64/target_prctl.h | 29 ++++++-----
target/arm/internals.h | 2 +-
target/arm/tcg/translate-a64.h | 2 +
target/arm/tcg/sme_helper.c | 8 +--
target/arm/tcg/sve_helper.c | 12 ++---
target/arm/tcg/translate-sme.c | 15 ++----
target/arm/tcg/translate-sve.c | 83 ++++++++++++++++++-------------
7 files changed, 83 insertions(+), 68 deletions(-)
--
2.34.1
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH v3 1/6] linux-user/aarch64: Choose SYNC as the preferred MTE mode
2024-02-07 2:52 [PATCH v3 0/6] target/arm: assorted mte fixes Richard Henderson
@ 2024-02-07 2:52 ` Richard Henderson
2024-02-07 20:03 ` Gustavo Romero
2024-02-07 2:52 ` [PATCH v3 2/6] target/arm: Fix nregs computation in do_{ld,st}_zpa Richard Henderson
` (6 subsequent siblings)
7 siblings, 1 reply; 14+ messages in thread
From: Richard Henderson @ 2024-02-07 2:52 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-arm, gustavo.romero, qemu-stable
The API does not generate an error for setting ASYNC | SYNC; that merely
constrains the selection vs the per-cpu default. For qemu linux-user,
choose SYNC as the default.
Cc: qemu-stable@nongnu.org
Reported-by: Gustavo Romero <gustavo.romero@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
linux-user/aarch64/target_prctl.h | 29 +++++++++++++++++------------
1 file changed, 17 insertions(+), 12 deletions(-)
diff --git a/linux-user/aarch64/target_prctl.h b/linux-user/aarch64/target_prctl.h
index 5067e7d731..aa8e203c15 100644
--- a/linux-user/aarch64/target_prctl.h
+++ b/linux-user/aarch64/target_prctl.h
@@ -173,21 +173,26 @@ static abi_long do_prctl_set_tagged_addr_ctrl(CPUArchState *env, abi_long arg2)
env->tagged_addr_enable = arg2 & PR_TAGGED_ADDR_ENABLE;
if (cpu_isar_feature(aa64_mte, cpu)) {
- switch (arg2 & PR_MTE_TCF_MASK) {
- case PR_MTE_TCF_NONE:
- case PR_MTE_TCF_SYNC:
- case PR_MTE_TCF_ASYNC:
- break;
- default:
- return -EINVAL;
- }
-
/*
* Write PR_MTE_TCF to SCTLR_EL1[TCF0].
- * Note that the syscall values are consistent with hw.
+ *
+ * The kernel has a per-cpu configuration for the sysadmin,
+ * /sys/devices/system/cpu/cpu<N>/mte_tcf_preferred,
+ * which qemu does not implement.
+ *
+ * Because there is no performance difference between the modes, and
+ * because SYNC is most useful for debugging MTE errors, choose SYNC
+ * as the preferred mode. With this preference, and the way the API
+ * uses only two bits, there is no way for the program to select
+ * ASYMM mode.
*/
- env->cp15.sctlr_el[1] =
- deposit64(env->cp15.sctlr_el[1], 38, 2, arg2 >> PR_MTE_TCF_SHIFT);
+ unsigned tcf = 0;
+ if (arg2 & PR_MTE_TCF_SYNC) {
+ tcf = 1;
+ } else if (arg2 & PR_MTE_TCF_ASYNC) {
+ tcf = 2;
+ }
+ env->cp15.sctlr_el[1] = deposit64(env->cp15.sctlr_el[1], 38, 2, tcf);
/*
* Write PR_MTE_TAG to GCR_EL1[Exclude].
--
2.34.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v3 2/6] target/arm: Fix nregs computation in do_{ld,st}_zpa
2024-02-07 2:52 [PATCH v3 0/6] target/arm: assorted mte fixes Richard Henderson
2024-02-07 2:52 ` [PATCH v3 1/6] linux-user/aarch64: Choose SYNC as the preferred MTE mode Richard Henderson
@ 2024-02-07 2:52 ` Richard Henderson
2024-02-08 16:24 ` [PATCH v3 2/6] target/arm: Fix nregs computation in do_{ld, st}_zpa Peter Maydell
2024-02-07 2:52 ` [PATCH v3 3/6] target/arm: Adjust and validate mtedesc sizem1 Richard Henderson
` (5 subsequent siblings)
7 siblings, 1 reply; 14+ messages in thread
From: Richard Henderson @ 2024-02-07 2:52 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-arm, gustavo.romero, qemu-stable
The field is encoded as [0-3], which is convenient for
indexing our array of function pointers, but the true
value is [1-4]. Adjust before calling do_mem_zpa.
Add an assert, and move the comment re passing ZT to
the helper back next to the relevant code.
Cc: qemu-stable@nongnu.org
Fixes: 206adacfb8d ("target/arm: Add mte helpers for sve scalar + int loads")
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/tcg/translate-sve.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
index 296e7d1ce2..7108938251 100644
--- a/target/arm/tcg/translate-sve.c
+++ b/target/arm/tcg/translate-sve.c
@@ -4445,11 +4445,7 @@ static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
TCGv_ptr t_pg;
int desc = 0;
- /*
- * For e.g. LD4, there are not enough arguments to pass all 4
- * registers as pointers, so encode the regno into the data field.
- * For consistency, do this even for LD1.
- */
+ assert(mte_n >= 1 && mte_n <= 4);
if (s->mte_active[0]) {
int msz = dtype_msz(dtype);
@@ -4463,6 +4459,11 @@ static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
addr = clean_data_tbi(s, addr);
}
+ /*
+ * For e.g. LD4, there are not enough arguments to pass all 4
+ * registers as pointers, so encode the regno into the data field.
+ * For consistency, do this even for LD1.
+ */
desc = simd_desc(vsz, vsz, zt | desc);
t_pg = tcg_temp_new_ptr();
@@ -4600,7 +4601,7 @@ static void do_ld_zpa(DisasContext *s, int zt, int pg,
* accessible via the instruction encoding.
*/
assert(fn != NULL);
- do_mem_zpa(s, zt, pg, addr, dtype, nreg, false, fn);
+ do_mem_zpa(s, zt, pg, addr, dtype, nreg + 1, false, fn);
}
static bool trans_LD_zprr(DisasContext *s, arg_rprr_load *a)
@@ -5168,14 +5169,13 @@ static void do_st_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
if (nreg == 0) {
/* ST1 */
fn = fn_single[s->mte_active[0]][be][msz][esz];
- nreg = 1;
} else {
/* ST2, ST3, ST4 -- msz == esz, enforced by encoding */
assert(msz == esz);
fn = fn_multiple[s->mte_active[0]][be][nreg - 1][msz];
}
assert(fn != NULL);
- do_mem_zpa(s, zt, pg, addr, msz_dtype(s, msz), nreg, true, fn);
+ do_mem_zpa(s, zt, pg, addr, msz_dtype(s, msz), nreg + 1, true, fn);
}
static bool trans_ST_zprr(DisasContext *s, arg_rprr_store *a)
--
2.34.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v3 3/6] target/arm: Adjust and validate mtedesc sizem1
2024-02-07 2:52 [PATCH v3 0/6] target/arm: assorted mte fixes Richard Henderson
2024-02-07 2:52 ` [PATCH v3 1/6] linux-user/aarch64: Choose SYNC as the preferred MTE mode Richard Henderson
2024-02-07 2:52 ` [PATCH v3 2/6] target/arm: Fix nregs computation in do_{ld,st}_zpa Richard Henderson
@ 2024-02-07 2:52 ` Richard Henderson
2024-02-16 15:12 ` Michael Tokarev
2024-02-07 2:52 ` [PATCH v3 4/6] target/arm: Split out make_svemte_desc Richard Henderson
` (4 subsequent siblings)
7 siblings, 1 reply; 14+ messages in thread
From: Richard Henderson @ 2024-02-07 2:52 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-arm, gustavo.romero, qemu-stable, Peter Maydell
When we added SVE_MTEDESC_SHIFT, we effectively limited the
maximum size of MTEDESC. Adjust SIZEM1 to consume the remaining
bits (32 - 10 - 5 - 12 == 5). Assert that the data to be stored
fits within the field (expecting 8 * 4 - 1 == 31, exact fit).
Cc: qemu-stable@nongnu.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/internals.h | 2 +-
target/arm/tcg/translate-sve.c | 7 ++++---
2 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/target/arm/internals.h b/target/arm/internals.h
index fc337fe40e..50bff44549 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -1278,7 +1278,7 @@ FIELD(MTEDESC, TBI, 4, 2)
FIELD(MTEDESC, TCMA, 6, 2)
FIELD(MTEDESC, WRITE, 8, 1)
FIELD(MTEDESC, ALIGN, 9, 3)
-FIELD(MTEDESC, SIZEM1, 12, SIMD_DATA_BITS - 12) /* size - 1 */
+FIELD(MTEDESC, SIZEM1, 12, SIMD_DATA_BITS - SVE_MTEDESC_SHIFT - 12) /* size - 1 */
bool mte_probe(CPUARMState *env, uint32_t desc, uint64_t ptr);
uint64_t mte_check(CPUARMState *env, uint32_t desc, uint64_t ptr, uintptr_t ra);
diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
index 7108938251..a88e523cba 100644
--- a/target/arm/tcg/translate-sve.c
+++ b/target/arm/tcg/translate-sve.c
@@ -4443,17 +4443,18 @@ static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
{
unsigned vsz = vec_full_reg_size(s);
TCGv_ptr t_pg;
+ uint32_t sizem1;
int desc = 0;
assert(mte_n >= 1 && mte_n <= 4);
+ sizem1 = (mte_n << dtype_msz(dtype)) - 1;
+ assert(sizem1 <= R_MTEDESC_SIZEM1_MASK >> R_MTEDESC_SIZEM1_SHIFT);
if (s->mte_active[0]) {
- int msz = dtype_msz(dtype);
-
desc = FIELD_DP32(desc, MTEDESC, MIDX, get_mem_index(s));
desc = FIELD_DP32(desc, MTEDESC, TBI, s->tbid);
desc = FIELD_DP32(desc, MTEDESC, TCMA, s->tcma);
desc = FIELD_DP32(desc, MTEDESC, WRITE, is_write);
- desc = FIELD_DP32(desc, MTEDESC, SIZEM1, (mte_n << msz) - 1);
+ desc = FIELD_DP32(desc, MTEDESC, SIZEM1, sizem1);
desc <<= SVE_MTEDESC_SHIFT;
} else {
addr = clean_data_tbi(s, addr);
--
2.34.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v3 4/6] target/arm: Split out make_svemte_desc
2024-02-07 2:52 [PATCH v3 0/6] target/arm: assorted mte fixes Richard Henderson
` (2 preceding siblings ...)
2024-02-07 2:52 ` [PATCH v3 3/6] target/arm: Adjust and validate mtedesc sizem1 Richard Henderson
@ 2024-02-07 2:52 ` Richard Henderson
2024-02-07 2:52 ` [PATCH v3 5/6] target/arm: Handle mte in do_ldrq, do_ldro Richard Henderson
` (3 subsequent siblings)
7 siblings, 0 replies; 14+ messages in thread
From: Richard Henderson @ 2024-02-07 2:52 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-arm, gustavo.romero, qemu-stable, Peter Maydell
Share code that creates mtedesc and embeds within simd_desc.
Cc: qemu-stable@nongnu.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/tcg/translate-a64.h | 2 ++
target/arm/tcg/translate-sme.c | 15 +++--------
target/arm/tcg/translate-sve.c | 47 ++++++++++++++++++----------------
3 files changed, 31 insertions(+), 33 deletions(-)
diff --git a/target/arm/tcg/translate-a64.h b/target/arm/tcg/translate-a64.h
index 96ba39b37e..7b811b8ac5 100644
--- a/target/arm/tcg/translate-a64.h
+++ b/target/arm/tcg/translate-a64.h
@@ -28,6 +28,8 @@ bool logic_imm_decode_wmask(uint64_t *result, unsigned int immn,
bool sve_access_check(DisasContext *s);
bool sme_enabled_check(DisasContext *s);
bool sme_enabled_check_with_svcr(DisasContext *s, unsigned);
+uint32_t make_svemte_desc(DisasContext *s, unsigned vsz, uint32_t nregs,
+ uint32_t msz, bool is_write, uint32_t data);
/* This function corresponds to CheckStreamingSVEEnabled. */
static inline bool sme_sm_enabled_check(DisasContext *s)
diff --git a/target/arm/tcg/translate-sme.c b/target/arm/tcg/translate-sme.c
index 8f0dfc884e..46c7fce8b4 100644
--- a/target/arm/tcg/translate-sme.c
+++ b/target/arm/tcg/translate-sme.c
@@ -206,7 +206,7 @@ static bool trans_LDST1(DisasContext *s, arg_LDST1 *a)
TCGv_ptr t_za, t_pg;
TCGv_i64 addr;
- int svl, desc = 0;
+ uint32_t desc;
bool be = s->be_data == MO_BE;
bool mte = s->mte_active[0];
@@ -224,18 +224,11 @@ static bool trans_LDST1(DisasContext *s, arg_LDST1 *a)
tcg_gen_shli_i64(addr, cpu_reg(s, a->rm), a->esz);
tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn));
- if (mte) {
- desc = FIELD_DP32(desc, MTEDESC, MIDX, get_mem_index(s));
- desc = FIELD_DP32(desc, MTEDESC, TBI, s->tbid);
- desc = FIELD_DP32(desc, MTEDESC, TCMA, s->tcma);
- desc = FIELD_DP32(desc, MTEDESC, WRITE, a->st);
- desc = FIELD_DP32(desc, MTEDESC, SIZEM1, (1 << a->esz) - 1);
- desc <<= SVE_MTEDESC_SHIFT;
- } else {
+ if (!mte) {
addr = clean_data_tbi(s, addr);
}
- svl = streaming_vec_reg_size(s);
- desc = simd_desc(svl, svl, desc);
+
+ desc = make_svemte_desc(s, streaming_vec_reg_size(s), 1, a->esz, a->st, 0);
fns[a->esz][be][a->v][mte][a->st](tcg_env, t_za, t_pg, addr,
tcg_constant_i32(desc));
diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
index a88e523cba..508f7b6bbd 100644
--- a/target/arm/tcg/translate-sve.c
+++ b/target/arm/tcg/translate-sve.c
@@ -4437,18 +4437,18 @@ static const uint8_t dtype_esz[16] = {
3, 2, 1, 3
};
-static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
- int dtype, uint32_t mte_n, bool is_write,
- gen_helper_gvec_mem *fn)
+uint32_t make_svemte_desc(DisasContext *s, unsigned vsz, uint32_t nregs,
+ uint32_t msz, bool is_write, uint32_t data)
{
- unsigned vsz = vec_full_reg_size(s);
- TCGv_ptr t_pg;
uint32_t sizem1;
- int desc = 0;
+ uint32_t desc = 0;
- assert(mte_n >= 1 && mte_n <= 4);
- sizem1 = (mte_n << dtype_msz(dtype)) - 1;
+ /* Assert all of the data fits, with or without MTE enabled. */
+ assert(nregs >= 1 && nregs <= 4);
+ sizem1 = (nregs << msz) - 1;
assert(sizem1 <= R_MTEDESC_SIZEM1_MASK >> R_MTEDESC_SIZEM1_SHIFT);
+ assert(data < 1u << SVE_MTEDESC_SHIFT);
+
if (s->mte_active[0]) {
desc = FIELD_DP32(desc, MTEDESC, MIDX, get_mem_index(s));
desc = FIELD_DP32(desc, MTEDESC, TBI, s->tbid);
@@ -4456,7 +4456,18 @@ static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
desc = FIELD_DP32(desc, MTEDESC, WRITE, is_write);
desc = FIELD_DP32(desc, MTEDESC, SIZEM1, sizem1);
desc <<= SVE_MTEDESC_SHIFT;
- } else {
+ }
+ return simd_desc(vsz, vsz, desc | data);
+}
+
+static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
+ int dtype, uint32_t nregs, bool is_write,
+ gen_helper_gvec_mem *fn)
+{
+ TCGv_ptr t_pg;
+ uint32_t desc;
+
+ if (!s->mte_active[0]) {
addr = clean_data_tbi(s, addr);
}
@@ -4465,7 +4476,8 @@ static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
* registers as pointers, so encode the regno into the data field.
* For consistency, do this even for LD1.
*/
- desc = simd_desc(vsz, vsz, zt | desc);
+ desc = make_svemte_desc(s, vec_full_reg_size(s), nregs,
+ dtype_msz(dtype), is_write, zt);
t_pg = tcg_temp_new_ptr();
tcg_gen_addi_ptr(t_pg, tcg_env, pred_full_reg_offset(s, pg));
@@ -5224,25 +5236,16 @@ static void do_mem_zpz(DisasContext *s, int zt, int pg, int zm,
int scale, TCGv_i64 scalar, int msz, bool is_write,
gen_helper_gvec_mem_scatter *fn)
{
- unsigned vsz = vec_full_reg_size(s);
TCGv_ptr t_zm = tcg_temp_new_ptr();
TCGv_ptr t_pg = tcg_temp_new_ptr();
TCGv_ptr t_zt = tcg_temp_new_ptr();
- int desc = 0;
-
- if (s->mte_active[0]) {
- desc = FIELD_DP32(desc, MTEDESC, MIDX, get_mem_index(s));
- desc = FIELD_DP32(desc, MTEDESC, TBI, s->tbid);
- desc = FIELD_DP32(desc, MTEDESC, TCMA, s->tcma);
- desc = FIELD_DP32(desc, MTEDESC, WRITE, is_write);
- desc = FIELD_DP32(desc, MTEDESC, SIZEM1, (1 << msz) - 1);
- desc <<= SVE_MTEDESC_SHIFT;
- }
- desc = simd_desc(vsz, vsz, desc | scale);
+ uint32_t desc;
tcg_gen_addi_ptr(t_pg, tcg_env, pred_full_reg_offset(s, pg));
tcg_gen_addi_ptr(t_zm, tcg_env, vec_full_reg_offset(s, zm));
tcg_gen_addi_ptr(t_zt, tcg_env, vec_full_reg_offset(s, zt));
+
+ desc = make_svemte_desc(s, vec_full_reg_size(s), 1, msz, is_write, scale);
fn(tcg_env, t_zt, t_pg, t_zm, scalar, tcg_constant_i32(desc));
}
--
2.34.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v3 5/6] target/arm: Handle mte in do_ldrq, do_ldro
2024-02-07 2:52 [PATCH v3 0/6] target/arm: assorted mte fixes Richard Henderson
` (3 preceding siblings ...)
2024-02-07 2:52 ` [PATCH v3 4/6] target/arm: Split out make_svemte_desc Richard Henderson
@ 2024-02-07 2:52 ` Richard Henderson
2024-02-07 2:52 ` [PATCH v3 6/6] target/arm: Fix SVE/SME gross MTE suppression checks Richard Henderson
` (2 subsequent siblings)
7 siblings, 0 replies; 14+ messages in thread
From: Richard Henderson @ 2024-02-07 2:52 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-arm, gustavo.romero, qemu-stable, Peter Maydell
These functions "use the standard load helpers", but
fail to clean_data_tbi or populate mtedesc.
Cc: qemu-stable@nongnu.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/tcg/translate-sve.c | 15 +++++++++++++--
1 file changed, 13 insertions(+), 2 deletions(-)
diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
index 508f7b6bbd..ada05aa530 100644
--- a/target/arm/tcg/translate-sve.c
+++ b/target/arm/tcg/translate-sve.c
@@ -4861,8 +4861,13 @@ static void do_ldrq(DisasContext *s, int zt, int pg, TCGv_i64 addr, int dtype)
unsigned vsz = vec_full_reg_size(s);
TCGv_ptr t_pg;
int poff;
+ uint32_t desc;
/* Load the first quadword using the normal predicated load helpers. */
+ if (!s->mte_active[0]) {
+ addr = clean_data_tbi(s, addr);
+ }
+
poff = pred_full_reg_offset(s, pg);
if (vsz > 16) {
/*
@@ -4886,7 +4891,8 @@ static void do_ldrq(DisasContext *s, int zt, int pg, TCGv_i64 addr, int dtype)
gen_helper_gvec_mem *fn
= ldr_fns[s->mte_active[0]][s->be_data == MO_BE][dtype][0];
- fn(tcg_env, t_pg, addr, tcg_constant_i32(simd_desc(16, 16, zt)));
+ desc = make_svemte_desc(s, 16, 1, dtype_msz(dtype), false, zt);
+ fn(tcg_env, t_pg, addr, tcg_constant_i32(desc));
/* Replicate that first quadword. */
if (vsz > 16) {
@@ -4929,6 +4935,7 @@ static void do_ldro(DisasContext *s, int zt, int pg, TCGv_i64 addr, int dtype)
unsigned vsz_r32;
TCGv_ptr t_pg;
int poff, doff;
+ uint32_t desc;
if (vsz < 32) {
/*
@@ -4941,6 +4948,9 @@ static void do_ldro(DisasContext *s, int zt, int pg, TCGv_i64 addr, int dtype)
}
/* Load the first octaword using the normal predicated load helpers. */
+ if (!s->mte_active[0]) {
+ addr = clean_data_tbi(s, addr);
+ }
poff = pred_full_reg_offset(s, pg);
if (vsz > 32) {
@@ -4965,7 +4975,8 @@ static void do_ldro(DisasContext *s, int zt, int pg, TCGv_i64 addr, int dtype)
gen_helper_gvec_mem *fn
= ldr_fns[s->mte_active[0]][s->be_data == MO_BE][dtype][0];
- fn(tcg_env, t_pg, addr, tcg_constant_i32(simd_desc(32, 32, zt)));
+ desc = make_svemte_desc(s, 32, 1, dtype_msz(dtype), false, zt);
+ fn(tcg_env, t_pg, addr, tcg_constant_i32(desc));
/*
* Replicate that first octaword.
--
2.34.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v3 6/6] target/arm: Fix SVE/SME gross MTE suppression checks
2024-02-07 2:52 [PATCH v3 0/6] target/arm: assorted mte fixes Richard Henderson
` (4 preceding siblings ...)
2024-02-07 2:52 ` [PATCH v3 5/6] target/arm: Handle mte in do_ldrq, do_ldro Richard Henderson
@ 2024-02-07 2:52 ` Richard Henderson
2024-02-07 20:09 ` [PATCH v3 0/6] target/arm: assorted mte fixes Gustavo Romero
2024-02-08 16:27 ` Peter Maydell
7 siblings, 0 replies; 14+ messages in thread
From: Richard Henderson @ 2024-02-07 2:52 UTC (permalink / raw)
To: qemu-devel; +Cc: qemu-arm, gustavo.romero, qemu-stable, Peter Maydell
The TBI and TCMA bits are located within mtedesc, not desc.
Cc: qemu-stable@nongnu.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
target/arm/tcg/sme_helper.c | 8 ++++----
target/arm/tcg/sve_helper.c | 12 ++++++------
2 files changed, 10 insertions(+), 10 deletions(-)
diff --git a/target/arm/tcg/sme_helper.c b/target/arm/tcg/sme_helper.c
index 1ee2690ceb..904bfdac43 100644
--- a/target/arm/tcg/sme_helper.c
+++ b/target/arm/tcg/sme_helper.c
@@ -573,8 +573,8 @@ void sme_ld1_mte(CPUARMState *env, void *za, uint64_t *vg,
desc = extract32(desc, 0, SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT);
/* Perform gross MTE suppression early. */
- if (!tbi_check(desc, bit55) ||
- tcma_check(desc, bit55, allocation_tag_from_addr(addr))) {
+ if (!tbi_check(mtedesc, bit55) ||
+ tcma_check(mtedesc, bit55, allocation_tag_from_addr(addr))) {
mtedesc = 0;
}
@@ -750,8 +750,8 @@ void sme_st1_mte(CPUARMState *env, void *za, uint64_t *vg, target_ulong addr,
desc = extract32(desc, 0, SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT);
/* Perform gross MTE suppression early. */
- if (!tbi_check(desc, bit55) ||
- tcma_check(desc, bit55, allocation_tag_from_addr(addr))) {
+ if (!tbi_check(mtedesc, bit55) ||
+ tcma_check(mtedesc, bit55, allocation_tag_from_addr(addr))) {
mtedesc = 0;
}
diff --git a/target/arm/tcg/sve_helper.c b/target/arm/tcg/sve_helper.c
index bce4295d28..6853f58c19 100644
--- a/target/arm/tcg/sve_helper.c
+++ b/target/arm/tcg/sve_helper.c
@@ -5800,8 +5800,8 @@ void sve_ldN_r_mte(CPUARMState *env, uint64_t *vg, target_ulong addr,
desc = extract32(desc, 0, SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT);
/* Perform gross MTE suppression early. */
- if (!tbi_check(desc, bit55) ||
- tcma_check(desc, bit55, allocation_tag_from_addr(addr))) {
+ if (!tbi_check(mtedesc, bit55) ||
+ tcma_check(mtedesc, bit55, allocation_tag_from_addr(addr))) {
mtedesc = 0;
}
@@ -6156,8 +6156,8 @@ void sve_ldnfff1_r_mte(CPUARMState *env, void *vg, target_ulong addr,
desc = extract32(desc, 0, SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT);
/* Perform gross MTE suppression early. */
- if (!tbi_check(desc, bit55) ||
- tcma_check(desc, bit55, allocation_tag_from_addr(addr))) {
+ if (!tbi_check(mtedesc, bit55) ||
+ tcma_check(mtedesc, bit55, allocation_tag_from_addr(addr))) {
mtedesc = 0;
}
@@ -6410,8 +6410,8 @@ void sve_stN_r_mte(CPUARMState *env, uint64_t *vg, target_ulong addr,
desc = extract32(desc, 0, SIMD_DATA_SHIFT + SVE_MTEDESC_SHIFT);
/* Perform gross MTE suppression early. */
- if (!tbi_check(desc, bit55) ||
- tcma_check(desc, bit55, allocation_tag_from_addr(addr))) {
+ if (!tbi_check(mtedesc, bit55) ||
+ tcma_check(mtedesc, bit55, allocation_tag_from_addr(addr))) {
mtedesc = 0;
}
--
2.34.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH v3 1/6] linux-user/aarch64: Choose SYNC as the preferred MTE mode
2024-02-07 2:52 ` [PATCH v3 1/6] linux-user/aarch64: Choose SYNC as the preferred MTE mode Richard Henderson
@ 2024-02-07 20:03 ` Gustavo Romero
2024-02-08 16:18 ` Peter Maydell
0 siblings, 1 reply; 14+ messages in thread
From: Gustavo Romero @ 2024-02-07 20:03 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: qemu-arm, qemu-stable
On 2/6/24 11:52 PM, Richard Henderson wrote:
> The API does not generate an error for setting ASYNC | SYNC; that merely
> constrains the selection vs the per-cpu default. For qemu linux-user,
> choose SYNC as the default.
>
> Cc: qemu-stable@nongnu.org
> Reported-by: Gustavo Romero <gustavo.romero@linaro.org>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> linux-user/aarch64/target_prctl.h | 29 +++++++++++++++++------------
> 1 file changed, 17 insertions(+), 12 deletions(-)
>
> diff --git a/linux-user/aarch64/target_prctl.h b/linux-user/aarch64/target_prctl.h
> index 5067e7d731..aa8e203c15 100644
> --- a/linux-user/aarch64/target_prctl.h
> +++ b/linux-user/aarch64/target_prctl.h
> @@ -173,21 +173,26 @@ static abi_long do_prctl_set_tagged_addr_ctrl(CPUArchState *env, abi_long arg2)
> env->tagged_addr_enable = arg2 & PR_TAGGED_ADDR_ENABLE;
>
> if (cpu_isar_feature(aa64_mte, cpu)) {
> - switch (arg2 & PR_MTE_TCF_MASK) {
> - case PR_MTE_TCF_NONE:
> - case PR_MTE_TCF_SYNC:
> - case PR_MTE_TCF_ASYNC:
> - break;
> - default:
> - return -EINVAL;
> - }
> -
> /*
> * Write PR_MTE_TCF to SCTLR_EL1[TCF0].
> - * Note that the syscall values are consistent with hw.
> + *
> + * The kernel has a per-cpu configuration for the sysadmin,
> + * /sys/devices/system/cpu/cpu<N>/mte_tcf_preferred,
> + * which qemu does not implement.
> + *
> + * Because there is no performance difference between the modes, and
> + * because SYNC is most useful for debugging MTE errors, choose SYNC
> + * as the preferred mode. With this preference, and the way the API
> + * uses only two bits, there is no way for the program to select
> + * ASYMM mode.
> */
> - env->cp15.sctlr_el[1] =
> - deposit64(env->cp15.sctlr_el[1], 38, 2, arg2 >> PR_MTE_TCF_SHIFT);
> + unsigned tcf = 0;
> + if (arg2 & PR_MTE_TCF_SYNC) {
> + tcf = 1;
> + } else if (arg2 & PR_MTE_TCF_ASYNC) {
> + tcf = 2;
> + }
> + env->cp15.sctlr_el[1] = deposit64(env->cp15.sctlr_el[1], 38, 2, tcf);
>
> /*
> * Write PR_MTE_TAG to GCR_EL1[Exclude].
>
ok, so no ASYMM in QEMU user-mode, plus if both SYNC and ASYNC flags are
specified by the user SYNC is selected. Contrary to what happens by default
on Linux, because of the mte_tcf_preferred value, which is ASYNC, and the
final value selected is define by:
resolved_mte_tcf = (mte_ctrl & pref) ? pref : mte_ctrl; [0]
where pref is mte_tcf_preferred (CPU, the value set in sys /mte_tcf_preferred)
and mte_ctr comes from the process, i.e. is the value specified by the user in
the flags -- hence the default on Linux if both flags are specified is ASYNC,
not SYNC.
(just some notes for the records).
Thanks.
[0] https://github.com/torvalds/linux/blob/master/arch/arm64/kernel/mte.c#L180-L186
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3 0/6] target/arm: assorted mte fixes
2024-02-07 2:52 [PATCH v3 0/6] target/arm: assorted mte fixes Richard Henderson
` (5 preceding siblings ...)
2024-02-07 2:52 ` [PATCH v3 6/6] target/arm: Fix SVE/SME gross MTE suppression checks Richard Henderson
@ 2024-02-07 20:09 ` Gustavo Romero
2024-02-08 16:27 ` Peter Maydell
7 siblings, 0 replies; 14+ messages in thread
From: Gustavo Romero @ 2024-02-07 20:09 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: qemu-arm
Hi Richard,
On 2/6/24 11:52 PM, Richard Henderson wrote:
> Changes for v3:
> - As if /sys/devices/system/cpu/cpu<N>/mte_tcf_preferred is "sync".
> - Fix do_st_zpa as well as do_ld_zpa. Oops.
>
> Because of the above, I dropped Gustavo's t-b.
>
>
> r~
>
>
> Richard Henderson (6):
> linux-user/aarch64: Choose SYNC as the preferred MTE mode
> target/arm: Fix nregs computation in do_{ld,st}_zpa
> target/arm: Adjust and validate mtedesc sizem1
> target/arm: Split out make_svemte_desc
> target/arm: Handle mte in do_ldrq, do_ldro
> target/arm: Fix SVE/SME gross MTE suppression checks
>
> linux-user/aarch64/target_prctl.h | 29 ++++++-----
> target/arm/internals.h | 2 +-
> target/arm/tcg/translate-a64.h | 2 +
> target/arm/tcg/sme_helper.c | 8 +--
> target/arm/tcg/sve_helper.c | 12 ++---
> target/arm/tcg/translate-sme.c | 15 ++----
> target/arm/tcg/translate-sve.c | 83 ++++++++++++++++++-------------
> 7 files changed, 83 insertions(+), 68 deletions(-)
>
Since this patchset fixes the "prctl() failed: Invalid argument"
on QEMU user-mode when both flags (ASYNC | SYNC) are passed
to prctl(PR_SET_TAGGED_ADDR_CTRL, ...), I tested it again --
expecting no different result -- so:
Tested-by: Gustavo Romero <gustavo.romero@linaro.org>
If that t-b tag doesn't make sense, feel free to drop it :)
Thanks for fixing it!
Cheers,
Gustavo
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3 1/6] linux-user/aarch64: Choose SYNC as the preferred MTE mode
2024-02-07 20:03 ` Gustavo Romero
@ 2024-02-08 16:18 ` Peter Maydell
0 siblings, 0 replies; 14+ messages in thread
From: Peter Maydell @ 2024-02-08 16:18 UTC (permalink / raw)
To: Gustavo Romero; +Cc: Richard Henderson, qemu-devel, qemu-arm, qemu-stable
On Wed, 7 Feb 2024 at 20:04, Gustavo Romero <gustavo.romero@linaro.org> wrote:
>
>
> On 2/6/24 11:52 PM, Richard Henderson wrote:
> > The API does not generate an error for setting ASYNC | SYNC; that merely
> > constrains the selection vs the per-cpu default. For qemu linux-user,
> > choose SYNC as the default.
> >
> > Cc: qemu-stable@nongnu.org
> > Reported-by: Gustavo Romero <gustavo.romero@linaro.org>
> > Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> > ---
> > linux-user/aarch64/target_prctl.h | 29 +++++++++++++++++------------
> > 1 file changed, 17 insertions(+), 12 deletions(-)
> >
> > diff --git a/linux-user/aarch64/target_prctl.h b/linux-user/aarch64/target_prctl.h
> > index 5067e7d731..aa8e203c15 100644
> > --- a/linux-user/aarch64/target_prctl.h
> > +++ b/linux-user/aarch64/target_prctl.h
> > @@ -173,21 +173,26 @@ static abi_long do_prctl_set_tagged_addr_ctrl(CPUArchState *env, abi_long arg2)
> > env->tagged_addr_enable = arg2 & PR_TAGGED_ADDR_ENABLE;
> >
> > if (cpu_isar_feature(aa64_mte, cpu)) {
> > - switch (arg2 & PR_MTE_TCF_MASK) {
> > - case PR_MTE_TCF_NONE:
> > - case PR_MTE_TCF_SYNC:
> > - case PR_MTE_TCF_ASYNC:
> > - break;
> > - default:
> > - return -EINVAL;
> > - }
> > -
> > /*
> > * Write PR_MTE_TCF to SCTLR_EL1[TCF0].
> > - * Note that the syscall values are consistent with hw.
> > + *
> > + * The kernel has a per-cpu configuration for the sysadmin,
> > + * /sys/devices/system/cpu/cpu<N>/mte_tcf_preferred,
> > + * which qemu does not implement.
> > + *
> > + * Because there is no performance difference between the modes, and
> > + * because SYNC is most useful for debugging MTE errors, choose SYNC
> > + * as the preferred mode. With this preference, and the way the API
> > + * uses only two bits, there is no way for the program to select
> > + * ASYMM mode.
> > */
> > - env->cp15.sctlr_el[1] =
> > - deposit64(env->cp15.sctlr_el[1], 38, 2, arg2 >> PR_MTE_TCF_SHIFT);
> > + unsigned tcf = 0;
> > + if (arg2 & PR_MTE_TCF_SYNC) {
> > + tcf = 1;
> > + } else if (arg2 & PR_MTE_TCF_ASYNC) {
> > + tcf = 2;
> > + }
> > + env->cp15.sctlr_el[1] = deposit64(env->cp15.sctlr_el[1], 38, 2, tcf);
> >
> > /*
> > * Write PR_MTE_TAG to GCR_EL1[Exclude].
> >
>
> ok, so no ASYMM in QEMU user-mode, plus if both SYNC and ASYNC flags are
> specified by the user SYNC is selected. Contrary to what happens by default
> on Linux, because of the mte_tcf_preferred value, which is ASYNC, and the
> final value selected is define by:
>
> resolved_mte_tcf = (mte_ctrl & pref) ? pref : mte_ctrl;
Yes; the kernel default mte_tcf_preferred is 'async', but as the comment
notes we do the equivalent of the sysadmin having set mte_tcf_preferred
to 'sync'. (The idea of the tunable is that you can set it to whatever
is a good balance between performance and debugging precision depending
on what your CPU implementation happens to be; for QEMU since there is
no performance gain from 'async' or 'asymm' we might as well act like
the sysadmin set the tunable to 'sync'.)
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
thanks
-- PMM
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3 2/6] target/arm: Fix nregs computation in do_{ld, st}_zpa
2024-02-07 2:52 ` [PATCH v3 2/6] target/arm: Fix nregs computation in do_{ld,st}_zpa Richard Henderson
@ 2024-02-08 16:24 ` Peter Maydell
0 siblings, 0 replies; 14+ messages in thread
From: Peter Maydell @ 2024-02-08 16:24 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, qemu-arm, gustavo.romero, qemu-stable
On Wed, 7 Feb 2024 at 02:53, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> The field is encoded as [0-3], which is convenient for
> indexing our array of function pointers, but the true
> value is [1-4]. Adjust before calling do_mem_zpa.
>
> Add an assert, and move the comment re passing ZT to
> the helper back next to the relevant code.
>
> Cc: qemu-stable@nongnu.org
> Fixes: 206adacfb8d ("target/arm: Add mte helpers for sve scalar + int loads")
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
thanks
-- PMM
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3 0/6] target/arm: assorted mte fixes
2024-02-07 2:52 [PATCH v3 0/6] target/arm: assorted mte fixes Richard Henderson
` (6 preceding siblings ...)
2024-02-07 20:09 ` [PATCH v3 0/6] target/arm: assorted mte fixes Gustavo Romero
@ 2024-02-08 16:27 ` Peter Maydell
7 siblings, 0 replies; 14+ messages in thread
From: Peter Maydell @ 2024-02-08 16:27 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, qemu-arm, gustavo.romero
On Wed, 7 Feb 2024 at 02:52, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Changes for v3:
> - As if /sys/devices/system/cpu/cpu<N>/mte_tcf_preferred is "sync".
> - Fix do_st_zpa as well as do_ld_zpa. Oops.
>
> Because of the above, I dropped Gustavo's t-b.
>
>
> r~
>
>
> Richard Henderson (6):
> linux-user/aarch64: Choose SYNC as the preferred MTE mode
> target/arm: Fix nregs computation in do_{ld,st}_zpa
> target/arm: Adjust and validate mtedesc sizem1
> target/arm: Split out make_svemte_desc
> target/arm: Handle mte in do_ldrq, do_ldro
> target/arm: Fix SVE/SME gross MTE suppression checks
Applied to target-arm.next, thanks.
-- PMM
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3 3/6] target/arm: Adjust and validate mtedesc sizem1
2024-02-07 2:52 ` [PATCH v3 3/6] target/arm: Adjust and validate mtedesc sizem1 Richard Henderson
@ 2024-02-16 15:12 ` Michael Tokarev
2024-02-16 19:17 ` Richard Henderson
0 siblings, 1 reply; 14+ messages in thread
From: Michael Tokarev @ 2024-02-16 15:12 UTC (permalink / raw)
To: Richard Henderson, qemu-devel
Cc: qemu-arm, gustavo.romero, qemu-stable, Peter Maydell
07.02.2024 05:52, Richard Henderson :
> When we added SVE_MTEDESC_SHIFT, we effectively limited the
> maximum size of MTEDESC. Adjust SIZEM1 to consume the remaining
> bits (32 - 10 - 5 - 12 == 5). Assert that the data to be stored
> fits within the field (expecting 8 * 4 - 1 == 31, exact fit).
>
> Cc: qemu-stable@nongnu.org
> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> target/arm/internals.h | 2 +-
> target/arm/tcg/translate-sve.c | 7 ++++---
> 2 files changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/target/arm/internals.h b/target/arm/internals.h
> index fc337fe40e..50bff44549 100644
> --- a/target/arm/internals.h
> +++ b/target/arm/internals.h
> @@ -1278,7 +1278,7 @@ FIELD(MTEDESC, TBI, 4, 2)
> FIELD(MTEDESC, TCMA, 6, 2)
> FIELD(MTEDESC, WRITE, 8, 1)
> FIELD(MTEDESC, ALIGN, 9, 3)
> -FIELD(MTEDESC, SIZEM1, 12, SIMD_DATA_BITS - 12) /* size - 1 */
> +FIELD(MTEDESC, SIZEM1, 12, SIMD_DATA_BITS - SVE_MTEDESC_SHIFT - 12) /* size - 1 */
>
> bool mte_probe(CPUARMState *env, uint32_t desc, uint64_t ptr);
> uint64_t mte_check(CPUARMState *env, uint32_t desc, uint64_t ptr, uintptr_t ra);
> diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
> index 7108938251..a88e523cba 100644
> --- a/target/arm/tcg/translate-sve.c
> +++ b/target/arm/tcg/translate-sve.c
> @@ -4443,17 +4443,18 @@ static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
> {
> unsigned vsz = vec_full_reg_size(s);
> TCGv_ptr t_pg;
> + uint32_t sizem1;
> int desc = 0;
>
> assert(mte_n >= 1 && mte_n <= 4);
> + sizem1 = (mte_n << dtype_msz(dtype)) - 1;
> + assert(sizem1 <= R_MTEDESC_SIZEM1_MASK >> R_MTEDESC_SIZEM1_SHIFT);
> if (s->mte_active[0]) {
> - int msz = dtype_msz(dtype);
> -
> desc = FIELD_DP32(desc, MTEDESC, MIDX, get_mem_index(s));
> desc = FIELD_DP32(desc, MTEDESC, TBI, s->tbid);
> desc = FIELD_DP32(desc, MTEDESC, TCMA, s->tcma);
> desc = FIELD_DP32(desc, MTEDESC, WRITE, is_write);
> - desc = FIELD_DP32(desc, MTEDESC, SIZEM1, (mte_n << msz) - 1);
> + desc = FIELD_DP32(desc, MTEDESC, SIZEM1, sizem1);
> desc <<= SVE_MTEDESC_SHIFT;
> } else {
> addr = clean_data_tbi(s, addr);
There's no question about stable-8.2 here, this change needed there.
But I've a question about stable-7.2 - does it make sense to pick this
one up for 7.2? It quickly goes out of control, because this one
is on top of
523da6b963455ce0a0e8d572d98d9cd91f952785 target/arm: Check alignment in helper_mte_check
(this one might be good for 7.2 by its own)
which needs:
3b97520c86e704b0533627c26b98173b71834bad target/arm: Pass single_memop to gen_mte_checkN
which needs:
6f47e7c18972802c428a5e03eb52a8f0a7bebe5c target/arm: Load/store integer pair with one tcg operation
which needs:
needs 128bit ops
659aed5feda4472d8aed4ccc69e125bba2af8b89 target/arm: Drop tcg_temp_free from translator-a64.c
...
So I think it's not a good idea to go down this hole..
Probably ditto for the other two:
target/arm: Split out make_svemte_desc
target/arm: Handle mte in do_ldrq, do_ldro
Makes sense? Or it's better to do a proper backport?
Thanks,
/mjt
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v3 3/6] target/arm: Adjust and validate mtedesc sizem1
2024-02-16 15:12 ` Michael Tokarev
@ 2024-02-16 19:17 ` Richard Henderson
0 siblings, 0 replies; 14+ messages in thread
From: Richard Henderson @ 2024-02-16 19:17 UTC (permalink / raw)
To: Michael Tokarev, qemu-devel
Cc: qemu-arm, gustavo.romero, qemu-stable, Peter Maydell
On 2/16/24 05:12, Michael Tokarev wrote:
> 07.02.2024 05:52, Richard Henderson :
>> When we added SVE_MTEDESC_SHIFT, we effectively limited the
>> maximum size of MTEDESC. Adjust SIZEM1 to consume the remaining
>> bits (32 - 10 - 5 - 12 == 5). Assert that the data to be stored
>> fits within the field (expecting 8 * 4 - 1 == 31, exact fit).
>>
>> Cc: qemu-stable@nongnu.org
>> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>> ---
>> target/arm/internals.h | 2 +-
>> target/arm/tcg/translate-sve.c | 7 ++++---
>> 2 files changed, 5 insertions(+), 4 deletions(-)
>>
>> diff --git a/target/arm/internals.h b/target/arm/internals.h
>> index fc337fe40e..50bff44549 100644
>> --- a/target/arm/internals.h
>> +++ b/target/arm/internals.h
>> @@ -1278,7 +1278,7 @@ FIELD(MTEDESC, TBI, 4, 2)
>> FIELD(MTEDESC, TCMA, 6, 2)
>> FIELD(MTEDESC, WRITE, 8, 1)
>> FIELD(MTEDESC, ALIGN, 9, 3)
>> -FIELD(MTEDESC, SIZEM1, 12, SIMD_DATA_BITS - 12) /* size - 1 */
>> +FIELD(MTEDESC, SIZEM1, 12, SIMD_DATA_BITS - SVE_MTEDESC_SHIFT - 12) /* size - 1 */
>> bool mte_probe(CPUARMState *env, uint32_t desc, uint64_t ptr);
>> uint64_t mte_check(CPUARMState *env, uint32_t desc, uint64_t ptr, uintptr_t ra);
>> diff --git a/target/arm/tcg/translate-sve.c b/target/arm/tcg/translate-sve.c
>> index 7108938251..a88e523cba 100644
>> --- a/target/arm/tcg/translate-sve.c
>> +++ b/target/arm/tcg/translate-sve.c
>> @@ -4443,17 +4443,18 @@ static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64
>> addr,
>> {
>> unsigned vsz = vec_full_reg_size(s);
>> TCGv_ptr t_pg;
>> + uint32_t sizem1;
>> int desc = 0;
>> assert(mte_n >= 1 && mte_n <= 4);
>> + sizem1 = (mte_n << dtype_msz(dtype)) - 1;
>> + assert(sizem1 <= R_MTEDESC_SIZEM1_MASK >> R_MTEDESC_SIZEM1_SHIFT);
>> if (s->mte_active[0]) {
>> - int msz = dtype_msz(dtype);
>> -
>> desc = FIELD_DP32(desc, MTEDESC, MIDX, get_mem_index(s));
>> desc = FIELD_DP32(desc, MTEDESC, TBI, s->tbid);
>> desc = FIELD_DP32(desc, MTEDESC, TCMA, s->tcma);
>> desc = FIELD_DP32(desc, MTEDESC, WRITE, is_write);
>> - desc = FIELD_DP32(desc, MTEDESC, SIZEM1, (mte_n << msz) - 1);
>> + desc = FIELD_DP32(desc, MTEDESC, SIZEM1, sizem1);
>> desc <<= SVE_MTEDESC_SHIFT;
>> } else {
>> addr = clean_data_tbi(s, addr);
>
> There's no question about stable-8.2 here, this change needed there.
> But I've a question about stable-7.2 - does it make sense to pick this
> one up for 7.2? It quickly goes out of control, because this one
> is on top of
>
> 523da6b963455ce0a0e8d572d98d9cd91f952785 target/arm: Check alignment in helper_mte_check
> (this one might be good for 7.2 by its own)
> which needs:
> 3b97520c86e704b0533627c26b98173b71834bad target/arm: Pass single_memop to gen_mte_checkN
> which needs:
> 6f47e7c18972802c428a5e03eb52a8f0a7bebe5c target/arm: Load/store integer pair with one
> tcg operation
> which needs:
> needs 128bit ops
> 659aed5feda4472d8aed4ccc69e125bba2af8b89 target/arm: Drop tcg_temp_free from
> translator-a64.c
> ...
>
> So I think it's not a good idea to go down this hole..
>
> Probably ditto for the other two:
> target/arm: Split out make_svemte_desc
> target/arm: Handle mte in do_ldrq, do_ldro
>
> Makes sense? Or it's better to do a proper backport?
I don't think it makes sense to go back too far.
r~
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2024-02-16 19:18 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-02-07 2:52 [PATCH v3 0/6] target/arm: assorted mte fixes Richard Henderson
2024-02-07 2:52 ` [PATCH v3 1/6] linux-user/aarch64: Choose SYNC as the preferred MTE mode Richard Henderson
2024-02-07 20:03 ` Gustavo Romero
2024-02-08 16:18 ` Peter Maydell
2024-02-07 2:52 ` [PATCH v3 2/6] target/arm: Fix nregs computation in do_{ld,st}_zpa Richard Henderson
2024-02-08 16:24 ` [PATCH v3 2/6] target/arm: Fix nregs computation in do_{ld, st}_zpa Peter Maydell
2024-02-07 2:52 ` [PATCH v3 3/6] target/arm: Adjust and validate mtedesc sizem1 Richard Henderson
2024-02-16 15:12 ` Michael Tokarev
2024-02-16 19:17 ` Richard Henderson
2024-02-07 2:52 ` [PATCH v3 4/6] target/arm: Split out make_svemte_desc Richard Henderson
2024-02-07 2:52 ` [PATCH v3 5/6] target/arm: Handle mte in do_ldrq, do_ldro Richard Henderson
2024-02-07 2:52 ` [PATCH v3 6/6] target/arm: Fix SVE/SME gross MTE suppression checks Richard Henderson
2024-02-07 20:09 ` [PATCH v3 0/6] target/arm: assorted mte fixes Gustavo Romero
2024-02-08 16:27 ` Peter Maydell
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.