* [Qemu-devel] [PATCH v3 0/2] target-mips: Add support for misaligned accesses
@ 2015-05-13 15:37 Yongbok Kim
  2015-05-13 15:37 ` [Qemu-devel] [PATCH v3 1/2] target-mips: Misaligned memory accesses for R6 Yongbok Kim
  2015-05-13 15:37 ` [Qemu-devel] [PATCH v3 2/2] target-mips: Misaligned memory accesses for MSA Yongbok Kim
  0 siblings, 2 replies; 22+ messages in thread
From: Yongbok Kim @ 2015-05-13 15:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, leon.alrae, afaerber, rth

This patch set adds support for misaligned memory accesses in MIPS Architecture
Release 6 and the MIPS SIMD Architecture (MSA).

The behaviour, semantics, and architecture specifications of misaligned memory
accesses are described in:
MIPS Architecture For Programmers Volume I-A: Introduction to the MIPS64
Architecture, Appendix B Misaligned Memory Accesses.
Available at http://www.imgtec.com/mips/architectures/mips64.asp

Regards,
Yongbok

v3:
* Rewrote MSA patch
* The work-around now uses byte accesses and endianness corrections for
  R5+MSA. (This replaces the misaligned flag from v2.) (Leon)
* Bug fixes (Leon)
* Separate helper functions for each data format

v2:
* Removed re-translation in the mips_cpu_do_unaligned_access() (Peter)
* Check validity only if an MSA access spans two pages (Leon)
* Introduced a misaligned flag to indicate that an MSA ld/st is in progress; it
  is used to allow misaligned accesses in the mips_cpu_do_unaligned_access()
  callback. This is crucial for supporting MSA misaligned accesses on Release 5
  cores.


Yongbok Kim (2):
  target-mips: Misaligned memory accesses for R6
  target-mips: Misaligned memory accesses for MSA

 target-mips/helper.h         |   10 ++-
 target-mips/op_helper.c      |  149 ++++++++++++++++++++++++-----------------
 target-mips/translate.c      |   23 +++++--
 target-mips/translate_init.c |    2 +-
 4 files changed, 112 insertions(+), 72 deletions(-)

-- 
1.7.5.4


* [Qemu-devel] [PATCH v3 1/2] target-mips: Misaligned memory accesses for R6
  2015-05-13 15:37 [Qemu-devel] [PATCH v3 0/2] target-mips: Add support for misaligned accesses Yongbok Kim
@ 2015-05-13 15:37 ` Yongbok Kim
  2015-05-13 15:37 ` [Qemu-devel] [PATCH v3 2/2] target-mips: Misaligned memory accesses for MSA Yongbok Kim
  1 sibling, 0 replies; 22+ messages in thread
From: Yongbok Kim @ 2015-05-13 15:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, leon.alrae, afaerber, rth

Release 6 requires misaligned memory access support for all ordinary memory
access instructions (for example, LW/SW, LWC1/SWC1).
However, misaligned support is not provided for certain special memory accesses
such as atomics (for example, LL/SC).

Allow misaligned accesses in the mips_cpu_do_unaligned_access() callback if the
core is an R6 core. As the LL/SC helper functions already check alignment,
simply allowing all accesses for R6 is sufficient.

Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
---
 target-mips/op_helper.c      |    7 +++++++
 target-mips/translate_init.c |    2 +-
 2 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/target-mips/op_helper.c b/target-mips/op_helper.c
index 73a8e45..58f02cf 100644
--- a/target-mips/op_helper.c
+++ b/target-mips/op_helper.c
@@ -2215,6 +2215,13 @@ void mips_cpu_do_unaligned_access(CPUState *cs, vaddr addr,
     int error_code = 0;
     int excp;
 
+    if (env->insn_flags & ISA_MIPS32R6) {
+        /* Release 6 provides support for misaligned memory access for
+         * all ordinary memory reference instructions
+         * */
+        return;
+    }
+
     env->CP0_BadVAddr = addr;
 
     if (access_type == MMU_DATA_STORE) {
diff --git a/target-mips/translate_init.c b/target-mips/translate_init.c
index 85a65e7..ec54fef 100644
--- a/target-mips/translate_init.c
+++ b/target-mips/translate_init.c
@@ -607,7 +607,7 @@ static const mips_def_t mips_defs[] =
     },
     {
         /* A generic CPU supporting MIPS64 Release 6 ISA.
-           FIXME: Support IEEE 754-2008 FP and misaligned memory accesses.
+           FIXME: Support IEEE 754-2008 FP.
                   Eventually this should be replaced by a real CPU model. */
         .name = "MIPS64R6-generic",
         .CP0_PRid = 0x00010000,
-- 
1.7.5.4


* [Qemu-devel] [PATCH v3 2/2] target-mips: Misaligned memory accesses for MSA
  2015-05-13 15:37 [Qemu-devel] [PATCH v3 0/2] target-mips: Add support for misaligned accesses Yongbok Kim
  2015-05-13 15:37 ` [Qemu-devel] [PATCH v3 1/2] target-mips: Misaligned memory accesses for R6 Yongbok Kim
@ 2015-05-13 15:37 ` Yongbok Kim
  2015-05-13 19:28   ` Richard Henderson
  1 sibling, 1 reply; 22+ messages in thread
From: Yongbok Kim @ 2015-05-13 15:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, leon.alrae, afaerber, rth

MIPS SIMD Architecture vector loads and stores require misalignment support.
An MSA memory access should behave as an atomic operation, so the validity of
all addresses involved must be checked when an access spans two pages.

For R5 cores with MSA there is no clean way to distinguish instructions that
support misaligned accesses from those that do not, so a work-around using
byte accesses with endianness corrections has been introduced.

The helper functions are split per data format, as the format is known at
translation time.
Use mmu_idx from cpu_mmu_index() instead of computing it from hflags.

Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
---
 target-mips/helper.h    |   10 +++-
 target-mips/op_helper.c |  142 ++++++++++++++++++++++++++--------------------
 target-mips/translate.c |   23 +++++--
 3 files changed, 104 insertions(+), 71 deletions(-)

diff --git a/target-mips/helper.h b/target-mips/helper.h
index 3bd0b02..c532276 100644
--- a/target-mips/helper.h
+++ b/target-mips/helper.h
@@ -931,5 +931,11 @@ DEF_HELPER_4(msa_ftint_u_df, void, env, i32, i32, i32)
 DEF_HELPER_4(msa_ffint_s_df, void, env, i32, i32, i32)
 DEF_HELPER_4(msa_ffint_u_df, void, env, i32, i32, i32)
 
-DEF_HELPER_5(msa_ld_df, void, env, i32, i32, i32, s32)
-DEF_HELPER_5(msa_st_df, void, env, i32, i32, i32, s32)
+#define MSALDST_PROTO(type)                                 \
+DEF_HELPER_4(msa_ld_ ## type, void, env, i32, i32, s32)     \
+DEF_HELPER_4(msa_st_ ## type, void, env, i32, i32, s32)
+MSALDST_PROTO(b)
+MSALDST_PROTO(h)
+MSALDST_PROTO(w)
+MSALDST_PROTO(d)
+#undef MSALDST_PROTO
diff --git a/target-mips/op_helper.c b/target-mips/op_helper.c
index 58f02cf..6b49162 100644
--- a/target-mips/op_helper.c
+++ b/target-mips/op_helper.c
@@ -3565,72 +3565,90 @@ FOP_CONDN_S(sne,  (float32_lt(fst1, fst0, &env->active_fpu.fp_status)
 /* Element-by-element access macros */
 #define DF_ELEMENTS(df) (MSA_WRLEN / DF_BITS(df))
 
-void helper_msa_ld_df(CPUMIPSState *env, uint32_t df, uint32_t wd, uint32_t rs,
-                     int32_t s10)
+static inline void ensure_atomic_msa_block_access(CPUMIPSState *env,
+                                                  target_ulong addr,
+                                                  int rw,
+                                                  int mmu_idx)
 {
-    wr_t *pwd = &(env->active_fpu.fpr[wd].wr);
-    target_ulong addr = env->active_tc.gpr[rs] + (s10 << df);
-    int i;
+#if !defined(CONFIG_USER_ONLY)
+#define MSA_PAGESPAN(x) (unlikely((((x) & ~TARGET_PAGE_MASK)                \
+                                   + MSA_WRLEN/8 - 1) >= TARGET_PAGE_SIZE))
+    CPUState *cs = CPU(mips_env_get_cpu(env));
+    target_ulong page_addr;
 
-    switch (df) {
-    case DF_BYTE:
-        for (i = 0; i < DF_ELEMENTS(DF_BYTE); i++) {
-            pwd->b[i] = do_lbu(env, addr + (i << DF_BYTE),
-                                env->hflags & MIPS_HFLAG_KSU);
-        }
-        break;
-    case DF_HALF:
-        for (i = 0; i < DF_ELEMENTS(DF_HALF); i++) {
-            pwd->h[i] = do_lhu(env, addr + (i << DF_HALF),
-                                env->hflags & MIPS_HFLAG_KSU);
-        }
-        break;
-    case DF_WORD:
-        for (i = 0; i < DF_ELEMENTS(DF_WORD); i++) {
-            pwd->w[i] = do_lw(env, addr + (i << DF_WORD),
-                                env->hflags & MIPS_HFLAG_KSU);
-        }
-        break;
-    case DF_DOUBLE:
-        for (i = 0; i < DF_ELEMENTS(DF_DOUBLE); i++) {
-            pwd->d[i] = do_ld(env, addr + (i << DF_DOUBLE),
-                                env->hflags & MIPS_HFLAG_KSU);
-        }
-        break;
+    if (MSA_PAGESPAN(addr)) {
+        /* first page */
+        tlb_fill(cs, addr, rw, mmu_idx, 0);
+        /* second page */
+        page_addr = (addr & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
+        tlb_fill(cs, page_addr, rw, mmu_idx, 0);
     }
+#endif
 }
 
-void helper_msa_st_df(CPUMIPSState *env, uint32_t df, uint32_t wd, uint32_t rs,
-                     int32_t s10)
-{
-    wr_t *pwd = &(env->active_fpu.fpr[wd].wr);
-    target_ulong addr = env->active_tc.gpr[rs] + (s10 << df);
-    int i;
+#define CORRECT_VECTOR_ENDIANNESS(DF, TYPE, SWAP)                           \
+static void correct_vector_endianness_ ## TYPE(wr_t *pwd, wr_t *pws)        \
+{                                                                           \
+    int i;                                                                  \
+    for (i = 0; i < DF_ELEMENTS(DF); i++) {                                 \
+        pwd->TYPE[i] = SWAP(pws->TYPE[i]);                                  \
+    }                                                                       \
+}
+CORRECT_VECTOR_ENDIANNESS(DF_BYTE,   b, /* assign */)
+CORRECT_VECTOR_ENDIANNESS(DF_HALF,   h, tswap16)
+CORRECT_VECTOR_ENDIANNESS(DF_WORD,   w, tswap32)
+CORRECT_VECTOR_ENDIANNESS(DF_DOUBLE, d, tswap64)
+
+#define MSA_LD_DF(DF, TYPE, LD_INSN)                                    \
+void helper_msa_ld_ ## TYPE(CPUMIPSState *env, uint32_t wd, uint32_t rs,\
+                     int32_t s10)                                       \
+{                                                                       \
+    wr_t *pwd = &(env->active_fpu.fpr[wd].wr);                          \
+    target_ulong addr = env->active_tc.gpr[rs] + (s10 << DF);           \
+    int i;                                                              \
+    int mmu_idx = cpu_mmu_index(env);                                   \
+    ensure_atomic_msa_block_access(env, addr, MMU_DATA_LOAD, mmu_idx);  \
+    if (unlikely(addr & ((1 << DF) - 1))) {                             \
+        /* work-around for misaligned accesses */                       \
+        for (i = 0; i < DF_ELEMENTS(DF_BYTE); i++) {                    \
+            pwd->b[i] = do_lbu(env, addr + (i << DF_BYTE), mmu_idx);    \
+        }                                                               \
+        correct_vector_endianness_ ## TYPE(pwd, pwd);                   \
+    } else {                                                            \
+        for (i = 0; i < DF_ELEMENTS(DF); i++) {                         \
+            pwd->TYPE[i] = LD_INSN(env, addr + (i << DF), mmu_idx);     \
+        }                                                               \
+    }                                                                   \
+}
+MSA_LD_DF(DF_BYTE,   b, do_lbu)
+MSA_LD_DF(DF_HALF,   h, do_lhu)
+MSA_LD_DF(DF_WORD,   w, do_lw)
+MSA_LD_DF(DF_DOUBLE, d, do_ld)
 
-    switch (df) {
-    case DF_BYTE:
-        for (i = 0; i < DF_ELEMENTS(DF_BYTE); i++) {
-            do_sb(env, addr + (i << DF_BYTE), pwd->b[i],
-                    env->hflags & MIPS_HFLAG_KSU);
-        }
-        break;
-    case DF_HALF:
-        for (i = 0; i < DF_ELEMENTS(DF_HALF); i++) {
-            do_sh(env, addr + (i << DF_HALF), pwd->h[i],
-                    env->hflags & MIPS_HFLAG_KSU);
-        }
-        break;
-    case DF_WORD:
-        for (i = 0; i < DF_ELEMENTS(DF_WORD); i++) {
-            do_sw(env, addr + (i << DF_WORD), pwd->w[i],
-                    env->hflags & MIPS_HFLAG_KSU);
-        }
-        break;
-    case DF_DOUBLE:
-        for (i = 0; i < DF_ELEMENTS(DF_DOUBLE); i++) {
-            do_sd(env, addr + (i << DF_DOUBLE), pwd->d[i],
-                    env->hflags & MIPS_HFLAG_KSU);
-        }
-        break;
-    }
+#define MSA_ST_DF(DF, TYPE, ST_INSN)                                    \
+void helper_msa_st_ ## TYPE(CPUMIPSState *env, uint32_t wd, uint32_t rs,\
+                     int32_t s10)                                       \
+{                                                                       \
+    wr_t *pwd = &(env->active_fpu.fpr[wd].wr);                          \
+    target_ulong addr = env->active_tc.gpr[rs] + (s10 << DF);           \
+    int i;                                                              \
+    int mmu_idx = cpu_mmu_index(env);                                   \
+    ensure_atomic_msa_block_access(env, addr, MMU_DATA_STORE, mmu_idx); \
+    if (unlikely(addr & ((1 << DF) - 1))) {                             \
+        /* work-around for misaligned accesses */                       \
+        wr_t wx;                                                        \
+        correct_vector_endianness_ ## TYPE(&wx, pwd);                   \
+        for (i = 0; i < DF_ELEMENTS(DF_BYTE); i++) {                    \
+            do_sb(env, addr + (i << DF_BYTE), wx.b[i], mmu_idx);        \
+        }                                                               \
+    } else {                                                            \
+        for (i = 0; i < DF_ELEMENTS(DF); i++) {                         \
+            ST_INSN(env, addr + (i << DF), pwd->TYPE[i], mmu_idx);      \
+        }                                                               \
+    }                                                                   \
 }
+MSA_ST_DF(DF_BYTE,   b, do_sb)
+MSA_ST_DF(DF_HALF,   h, do_sh)
+MSA_ST_DF(DF_WORD,   w, do_sw)
+MSA_ST_DF(DF_DOUBLE, d, do_sd)
+
diff --git a/target-mips/translate.c b/target-mips/translate.c
index fd063a2..59c0aeb 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -18402,32 +18402,41 @@ static void gen_msa(CPUMIPSState *env, DisasContext *ctx)
             int32_t s10 = sextract32(ctx->opcode, 16, 10);
             uint8_t rs = (ctx->opcode >> 11) & 0x1f;
             uint8_t wd = (ctx->opcode >> 6) & 0x1f;
-            uint8_t df = (ctx->opcode >> 0) & 0x3;
 
-            TCGv_i32 tdf = tcg_const_i32(df);
             TCGv_i32 twd = tcg_const_i32(wd);
             TCGv_i32 trs = tcg_const_i32(rs);
             TCGv_i32 ts10 = tcg_const_i32(s10);
 
+            save_cpu_state(ctx, 1);
+
             switch (MASK_MSA_MINOR(opcode)) {
             case OPC_LD_B:
+                gen_helper_msa_ld_b(cpu_env, twd, trs, ts10);
+                break;
             case OPC_LD_H:
+                gen_helper_msa_ld_h(cpu_env, twd, trs, ts10);
+                break;
             case OPC_LD_W:
+                gen_helper_msa_ld_w(cpu_env, twd, trs, ts10);
+                break;
             case OPC_LD_D:
-                save_cpu_state(ctx, 1);
-                gen_helper_msa_ld_df(cpu_env, tdf, twd, trs, ts10);
+                gen_helper_msa_ld_d(cpu_env, twd, trs, ts10);
                 break;
             case OPC_ST_B:
+                gen_helper_msa_st_b(cpu_env, twd, trs, ts10);
+                break;
             case OPC_ST_H:
+                gen_helper_msa_st_h(cpu_env, twd, trs, ts10);
+                break;
             case OPC_ST_W:
+                gen_helper_msa_st_w(cpu_env, twd, trs, ts10);
+                break;
             case OPC_ST_D:
-                save_cpu_state(ctx, 1);
-                gen_helper_msa_st_df(cpu_env, tdf, twd, trs, ts10);
+                gen_helper_msa_st_d(cpu_env, twd, trs, ts10);
                 break;
             }
 
             tcg_temp_free_i32(twd);
-            tcg_temp_free_i32(tdf);
             tcg_temp_free_i32(trs);
             tcg_temp_free_i32(ts10);
         }
-- 
1.7.5.4


* Re: [Qemu-devel] [PATCH v3 2/2] target-mips: Misaligned memory accesses for MSA
  2015-05-13 15:37 ` [Qemu-devel] [PATCH v3 2/2] target-mips: Misaligned memory accesses for MSA Yongbok Kim
@ 2015-05-13 19:28   ` Richard Henderson
  2015-05-13 19:56     ` Maciej W. Rozycki
                       ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Richard Henderson @ 2015-05-13 19:28 UTC (permalink / raw)
  To: Yongbok Kim, qemu-devel; +Cc: peter.maydell, leon.alrae, afaerber

On 05/13/2015 08:37 AM, Yongbok Kim wrote:
> +static inline void ensure_atomic_msa_block_access(CPUMIPSState *env,
> +                                                  target_ulong addr,
> +                                                  int rw,
> +                                                  int mmu_idx)
>  {
> +#if !defined(CONFIG_USER_ONLY)
> +#define MSA_PAGESPAN(x) (unlikely((((x) & ~TARGET_PAGE_MASK)                \
> +                                   + MSA_WRLEN/8 - 1) >= TARGET_PAGE_SIZE))
> +    CPUState *cs = CPU(mips_env_get_cpu(env));
> +    target_ulong page_addr;
>  
> +    if (MSA_PAGESPAN(addr)) {
> +        /* first page */
> +        tlb_fill(cs, addr, rw, mmu_idx, 0);
> +        /* second page */
> +        page_addr = (addr & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
> +        tlb_fill(cs, page_addr, rw, mmu_idx, 0);
>      }
> +#endif
>  }

This doesn't do quite what you think it does.  It does trap if the page isn't
mapped at all, but it doesn't trap if e.g. rw is set and the page is read-only.
That requires a subsequent check for what permissions were installed by
tlb_set_page.

I had thought there was a way to look this up besides duplicating the code in
softmmu_template.h, but I suppose that's in a patch set that never made it in.


> +    if (unlikely(addr & ((1 << DF) - 1))) {                             \
> +        /* work-around for misaligned accesses */                       \
> +        for (i = 0; i < DF_ELEMENTS(DF_BYTE); i++) {                    \
> +            pwd->b[i] = do_lbu(env, addr + (i << DF_BYTE), mmu_idx);    \
> +        }                                                               \
> +        correct_vector_endianness_ ## TYPE(pwd, pwd);                   \

Why byte accesses?  The softmmu helpers are guaranteed to support misalignment.

As an aside, consider moving away from

#define HELPER_LD(name, insn, type)                                     \
static inline type do_##name(CPUMIPSState *env, target_ulong addr,      \
                             int mem_idx)                               \
{                                                                       \
    switch (mem_idx)                                                    \
    {                                                                   \
    case 0: return (type) cpu_##insn##_kernel(env, addr); break;        \
    case 1: return (type) cpu_##insn##_super(env, addr); break;         \
    default:                                                            \
    case 2: return (type) cpu_##insn##_user(env, addr); break;          \
    }                                                                   \
}

to using helper_ret_*_mmu directly.  Which allows you to specify the mmu_idx
directly rather than bouncing around different thunks.  It also allows you to
pass in GETRA(), which would allow these helpers to use cpu_restore_state on
faults.
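
For illustration, a minimal sketch of what a direct call could look like (this
is not part of the posted patch; it assumes the merged memop+mmu_idx ("oi")
form of the softmmu helpers, and the do_lbu_direct name is made up here):

  /* Sketch only: a byte load that calls the softmmu helper directly with
     an explicit mmu_idx and the host return address, so the helper can
     restore CPU state itself on a fault, instead of dispatching on the
     KSU-based thunks above.  */
  static inline uint8_t do_lbu_direct(CPUMIPSState *env, target_ulong addr,
                                      int mmu_idx, uintptr_t retaddr)
  {
      TCGMemOpIdx oi = make_memop_idx(MO_UB, mmu_idx);

      return helper_ret_ldub_mmu(env, addr, oi, retaddr);
  }

A caller in an MSA helper would then pass GETRA() as retaddr.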


r~


* Re: [Qemu-devel] [PATCH v3 2/2] target-mips: Misaligned memory accesses for MSA
  2015-05-13 19:28   ` Richard Henderson
@ 2015-05-13 19:56     ` Maciej W. Rozycki
  2015-05-13 19:58       ` Richard Henderson
  2015-05-14  9:00     ` Yongbok Kim
  2015-05-14  9:50     ` Leon Alrae
  2 siblings, 1 reply; 22+ messages in thread
From: Maciej W. Rozycki @ 2015-05-13 19:56 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Yongbok Kim, peter.maydell, leon.alrae, qemu-devel, afaerber

On Wed, 13 May 2015, Richard Henderson wrote:

> > +static inline void ensure_atomic_msa_block_access(CPUMIPSState *env,
> > +                                                  target_ulong addr,
> > +                                                  int rw,
> > +                                                  int mmu_idx)
> >  {
> > +#if !defined(CONFIG_USER_ONLY)
> > +#define MSA_PAGESPAN(x) (unlikely((((x) & ~TARGET_PAGE_MASK)                \
> > +                                   + MSA_WRLEN/8 - 1) >= TARGET_PAGE_SIZE))
> > +    CPUState *cs = CPU(mips_env_get_cpu(env));
> > +    target_ulong page_addr;
> >  
> > +    if (MSA_PAGESPAN(addr)) {
> > +        /* first page */
> > +        tlb_fill(cs, addr, rw, mmu_idx, 0);
> > +        /* second page */
> > +        page_addr = (addr & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
> > +        tlb_fill(cs, page_addr, rw, mmu_idx, 0);
> >      }
> > +#endif
> >  }
> 
> This doesn't do quite what you think it does.  It does trap if the page isn't
> mapped at all, but it doesn't trap if e.g. rw is set and the page is read-only.
> That requires a subsequent check for what permissions were installed by
> tlb_set_page.
> 
> I had thought there was a way to look this up besides duplicating the code in
> softmmu_template.h, but I suppose that's in a patch set that never made it in.

 We must have a way to deal with memory access operations issued by a 
single machine instruction crossing a page boundary already as this is 
what MIPS16 SAVE and RESTORE instructions as well as microMIPS SWP, SDP, 
SWM, SDM, LWP, LDP, LWM and LDM ones do.  Perhaps these are worth 
looking into and their approach copying (or reusing) here?

  Maciej


* Re: [Qemu-devel] [PATCH v3 2/2] target-mips: Misaligned memory accesses for MSA
  2015-05-13 19:56     ` Maciej W. Rozycki
@ 2015-05-13 19:58       ` Richard Henderson
  2015-05-13 20:59         ` Leon Alrae
  0 siblings, 1 reply; 22+ messages in thread
From: Richard Henderson @ 2015-05-13 19:58 UTC (permalink / raw)
  To: Maciej W. Rozycki
  Cc: Yongbok Kim, peter.maydell, leon.alrae, qemu-devel, afaerber

On 05/13/2015 12:56 PM, Maciej W. Rozycki wrote:
>  We must have a way to deal with memory access operations issued by a 
> single machine instruction crossing a page boundary already as this is 
> what MIPS16 SAVE and RESTORE instructions as well as microMIPS SWP, SDP, 
> SWM, SDM, LWP, LDP, LWM and LDM ones do.  Perhaps these are worth 
> looking into and their approach copying (or reusing) here?

Certainly we do.  It's all in softmmu_template.h.


r~


* Re: [Qemu-devel] [PATCH v3 2/2] target-mips: Misaligned memory accesses for MSA
  2015-05-13 19:58       ` Richard Henderson
@ 2015-05-13 20:59         ` Leon Alrae
  2015-05-13 21:21           ` Maciej W. Rozycki
  2015-05-13 21:31           ` Richard Henderson
  0 siblings, 2 replies; 22+ messages in thread
From: Leon Alrae @ 2015-05-13 20:59 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Yongbok Kim, peter.maydell, qemu-devel, Maciej W. Rozycki, afaerber

On 13/05/15 20:58, Richard Henderson wrote:
> On 05/13/2015 12:56 PM, Maciej W. Rozycki wrote:
>>  We must have a way to deal with memory access operations issued by a 
>> single machine instruction crossing a page boundary already as this is 
>> what MIPS16 SAVE and RESTORE instructions as well as microMIPS SWP, SDP, 
>> SWM, SDM, LWP, LDP, LWM and LDM ones do.  Perhaps these are worth 
>> looking into and their approach copying (or reusing) here?
> 
> Certainly we do.  It's all in softmmu_template.h.

I believe the problem is that MSA vector register's size is 16-bytes
(this DATA_SIZE isn't supported in softmmu_template) and MSA load/store
is supposed to be atomic.

Leon


* Re: [Qemu-devel] [PATCH v3 2/2] target-mips: Misaligned memory accesses for MSA
  2015-05-13 20:59         ` Leon Alrae
@ 2015-05-13 21:21           ` Maciej W. Rozycki
  2015-05-13 21:36             ` Richard Henderson
  2015-05-13 21:31           ` Richard Henderson
  1 sibling, 1 reply; 22+ messages in thread
From: Maciej W. Rozycki @ 2015-05-13 21:21 UTC (permalink / raw)
  To: Leon Alrae
  Cc: Yongbok Kim, peter.maydell, qemu-devel, afaerber, Richard Henderson

On Wed, 13 May 2015, Leon Alrae wrote:

> > Certainly we do.  It's all in softmmu_template.h.
> 
> I believe the problem is that MSA vector register's size is 16-bytes
> (this DATA_SIZE isn't supported in softmmu_template) and MSA load/store
> is supposed to be atomic.

 Not really AFAICT.  Here's what the specification says[1]:

"The vector load instruction is atomic at the element level with no 
guaranteed ordering among elements, i.e. each element load is an atomic 
operation issued in no particular order with respect to the element's 
vector position."

and[2]:

"The vector store instruction is atomic at the element level with no 
guaranteed ordering among elements, i.e. each element store is an atomic 
operation issued in no particular order with respect to the element's 
vector position."

so you only need to get atomic up to 8 bytes (with LD.D and ST.D, less 
with the narrower vector elements), and that looks supported to me.

References:

[1] "MIPS Architecture for Programmers, Volume IV-j: The MIPS32 SIMD 
    Architecture Module", Revision 1.07, MIPS Technologies, Inc., 
    Document Number: MD00866, October 2, 2013, p. 314

[2] same, p. 414

  Maciej


* Re: [Qemu-devel] [PATCH v3 2/2] target-mips: Misaligned memory accesses for MSA
  2015-05-13 20:59         ` Leon Alrae
  2015-05-13 21:21           ` Maciej W. Rozycki
@ 2015-05-13 21:31           ` Richard Henderson
  1 sibling, 0 replies; 22+ messages in thread
From: Richard Henderson @ 2015-05-13 21:31 UTC (permalink / raw)
  To: Leon Alrae
  Cc: Yongbok Kim, peter.maydell, qemu-devel, Maciej W. Rozycki, afaerber

On 05/13/2015 01:59 PM, Leon Alrae wrote:
> On 13/05/15 20:58, Richard Henderson wrote:
>> On 05/13/2015 12:56 PM, Maciej W. Rozycki wrote:
>>>  We must have a way to deal with memory access operations issued by a 
>>> single machine instruction crossing a page boundary already as this is 
>>> what MIPS16 SAVE and RESTORE instructions as well as microMIPS SWP, SDP, 
>>> SWM, SDM, LWP, LDP, LWM and LDM ones do.  Perhaps these are worth 
>>> looking into and their approach copying (or reusing) here?
>>
>> Certainly we do.  It's all in softmmu_template.h.
> 
> I believe the problem is that MSA vector register's size is 16-bytes
> (this DATA_SIZE isn't supported in softmmu_template) and MSA load/store
> is supposed to be atomic.

What I meant is that the code to handle page boundary crosses is there in
softmmu_template.h.  If you want to copy that mechanism into the mips backend,
that's where you should start.


r~


* Re: [Qemu-devel] [PATCH v3 2/2] target-mips: Misaligned memory accesses for MSA
  2015-05-13 21:21           ` Maciej W. Rozycki
@ 2015-05-13 21:36             ` Richard Henderson
  2015-05-13 22:54               ` Maciej W. Rozycki
  0 siblings, 1 reply; 22+ messages in thread
From: Richard Henderson @ 2015-05-13 21:36 UTC (permalink / raw)
  To: Maciej W. Rozycki, Leon Alrae
  Cc: Yongbok Kim, peter.maydell, qemu-devel, afaerber

On 05/13/2015 02:21 PM, Maciej W. Rozycki wrote:
> On Wed, 13 May 2015, Leon Alrae wrote:
> 
>>> Certainly we do.  It's all in softmmu_template.h.
>>
>> I believe the problem is that MSA vector register's size is 16-bytes
>> (this DATA_SIZE isn't supported in softmmu_template) and MSA load/store
>> is supposed to be atomic.
> 
>  Not really AFAICT.  Here's what the specification says[1]:
> 
> "The vector load instruction is atomic at the element level with no 
> guaranteed ordering among elements, i.e. each element load is an atomic 
> operation issued in no particular order with respect to the element's 
> vector position."
> 
> and[2]:
> 
> "The vector store instruction is atomic at the element level with no 
> guaranteed ordering among elements, i.e. each element store is an atomic 
> operation issued in no particular order with respect to the element's 
> vector position."
> 
> so you only need to get atomic up to 8 bytes (with LD.D and ST.D, less 
> with the narrower vector elements), and that looks supported to me.

There's "atomic" in the transactional sense, and then there's "atomic" in the
visibility to other actors on the bus sense.

Presumably Leon is talking about the first, wherein we must ensure all writes
to both pages must succeed.  Which just means making sure that both pages are
present and writable before modifying any memory.

This patch set *is* doing too much work for loads, since the loads can simply
be issued, and the result written back to the register at the end after all
have succeeded.
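
For concreteness, a minimal sketch of that load-side approach, reusing the
shape of the MSA_LD_DF helpers from this patch (do_lw, wr_t and the DF_*
macros are as in the existing code; this is not the posted patch):

  /* Sketch only: issue every element load into a temporary, and commit to
     the destination vector register only after all loads have succeeded,
     so a fault on the second page leaves wd unmodified.  */
  void helper_msa_ld_w(CPUMIPSState *env, uint32_t wd, uint32_t rs,
                       int32_t s10)
  {
      target_ulong addr = env->active_tc.gpr[rs] + (s10 << DF_WORD);
      int mmu_idx = cpu_mmu_index(env);
      wr_t wx;
      int i;

      for (i = 0; i < DF_ELEMENTS(DF_WORD); i++) {
          /* Any of these may fault and longjmp out; wd is still intact.  */
          wx.w[i] = do_lw(env, addr + (i << DF_WORD), mmu_idx);
      }
      env->active_fpu.fpr[wd].wr = wx;
  }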


r~


* Re: [Qemu-devel] [PATCH v3 2/2] target-mips: Misaligned memory accesses for MSA
  2015-05-13 21:36             ` Richard Henderson
@ 2015-05-13 22:54               ` Maciej W. Rozycki
  2015-05-14  8:51                 ` Leon Alrae
  0 siblings, 1 reply; 22+ messages in thread
From: Maciej W. Rozycki @ 2015-05-13 22:54 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Yongbok Kim, peter.maydell, Leon Alrae, qemu-devel, afaerber

On Wed, 13 May 2015, Richard Henderson wrote:

> >> I believe the problem is that MSA vector register's size is 16-bytes
> >> (this DATA_SIZE isn't supported in softmmu_template) and MSA load/store
> >> is supposed to be atomic.
> > 
> >  Not really AFAICT.  Here's what the specification says[1]:
> > 
> > "The vector load instruction is atomic at the element level with no 
> > guaranteed ordering among elements, i.e. each element load is an atomic 
> > operation issued in no particular order with respect to the element's 
> > vector position."
> > 
> > and[2]:
> > 
> > "The vector store instruction is atomic at the element level with no 
> > guaranteed ordering among elements, i.e. each element store is an atomic 
> > operation issued in no particular order with respect to the element's 
> > vector position."
> > 
> > so you only need to get atomic up to 8 bytes (with LD.D and ST.D, less 
> > with the narrower vector elements), and that looks supported to me.
> 
> There's "atomic" in the transactional sense, and then there's "atomic" in the
> visibility to other actors on the bus sense.
> 
> Presumably Leon is talking about the first, wherein we must ensure all writes
> to both pages must succeed.  Which just means making sure that both pages are
> present and writable before modifying any memory.

 I don't think we have.  The specification is a bit unclear I must admit 
and it also defines the details of vector load and store operations as 
implementation dependent, so there's no further clarification.

 However any unaligned loads or stores that cross a data-bus-width 
boundary require two bus cycles to complete and therefore by definition 
are not atomic in the visibility to other actors on the bus sense.  
Therefore the only atomicity sense that can be considered here is I 
believe transactional, on the per-element basis as this is what the 
specification refers to.

 Then the exact semantics of loads and stores is left up to the 
implementer, so for example ST.H can be implemented as 2 
doubleword-store transactions, or 4 word-store transactions (that 
wouldn't be allowed with ST.D), or 8 halfword-store transactions (that 
wouldn't be allowed with ST.W), but not 16 byte-store transactions (that 
would be allowed with ST.B).

 Consequently I believe only individual vector element writes (or reads, 
for that matter) are required to either successfully complete or 
completely back out, and a TLB, an address error or a bus error 
exception (or perhaps a hardware interrupt exception even) happening in 
the middle of a vector load or store instruction may observe the 
destination vector register or memory respectively partially updated 
with elements already transferred (but not an individual element 
partially transferred).

 That would be consistent with what happens with the other multi-word 
transfer instructions I mentioned when they get interrupted on the way 
(yes, they do allow hardware interrupts to break them too) and likely 
easier to implement as well.

 That's just my interpretation though.  Perhaps the specification needs a 
further clarification.

  Maciej


* Re: [Qemu-devel] [PATCH v3 2/2] target-mips: Misaligned memory accesses for MSA
  2015-05-13 22:54               ` Maciej W. Rozycki
@ 2015-05-14  8:51                 ` Leon Alrae
  2015-05-14 11:22                   ` Maciej W. Rozycki
  0 siblings, 1 reply; 22+ messages in thread
From: Leon Alrae @ 2015-05-14  8:51 UTC (permalink / raw)
  To: Maciej W. Rozycki, Richard Henderson
  Cc: Yongbok Kim, peter.maydell, qemu-devel, afaerber

On 13/05/2015 23:54, Maciej W. Rozycki wrote:
> On Wed, 13 May 2015, Richard Henderson wrote:
> 
>>>> I believe the problem is that MSA vector register's size is 16-bytes
>>>> (this DATA_SIZE isn't supported in softmmu_template) and MSA load/store
>>>> is supposed to be atomic.
>>>
>>>  Not really AFAICT.  Here's what the specification says[1]:
>>>
>>> "The vector load instruction is atomic at the element level with no 
>>> guaranteed ordering among elements, i.e. each element load is an atomic 
>>> operation issued in no particular order with respect to the element's 
>>> vector position."
>>>
>>> and[2]:
>>>
>>> "The vector store instruction is atomic at the element level with no 
>>> guaranteed ordering among elements, i.e. each element store is an atomic 
>>> operation issued in no particular order with respect to the element's 
>>> vector position."
>>>
>>> so you only need to get atomic up to 8 bytes (with LD.D and ST.D, less 
>>> with the narrower vector elements), and that looks supported to me.
>>
>> There's "atomic" in the transactional sense, and then there's "atomic" in the
>> visibility to other actors on the bus sense.
>>
>> Presumably Leon is talking about the first, wherein we must ensure all writes
>> to both pages must succeed.  Which just means making sure that both pages are
>> present and writable before modifying any memory.
> 
>  I don't think we have.  The specification is a bit unclear I must admit 
> and it also defines the details of vector load and store operations as 
> implementation dependent, so there's no further clarification.

This is specified in "MIPS Architecture For Programmers Volume I-A:
Introduction to the MIPS64 Architecture", Revision: 6.01, Imagination
Technologies, Document Number: MD00083, August 20, 2014, p.142:

"For example, a misaligned vector load instruction will never leave its
vector destination register half written, if part of a page split
succeeds and the other part takes an exception. It is either all done,
or not at all."

Leon


* Re: [Qemu-devel] [PATCH v3 2/2] target-mips: Misaligned memory accesses for MSA
  2015-05-13 19:28   ` Richard Henderson
  2015-05-13 19:56     ` Maciej W. Rozycki
@ 2015-05-14  9:00     ` Yongbok Kim
  2015-05-14  9:46       ` Yongbok Kim
  2015-05-14  9:50     ` Leon Alrae
  2 siblings, 1 reply; 22+ messages in thread
From: Yongbok Kim @ 2015-05-14  9:00 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: peter.maydell, leon.alrae, afaerber

On 13/05/2015 20:28, Richard Henderson wrote:
> On 05/13/2015 08:37 AM, Yongbok Kim wrote:
>> +static inline void ensure_atomic_msa_block_access(CPUMIPSState *env,
>> +                                                  target_ulong addr,
>> +                                                  int rw,
>> +                                                  int mmu_idx)
>>  {
>> +#if !defined(CONFIG_USER_ONLY)
>> +#define MSA_PAGESPAN(x) (unlikely((((x) & ~TARGET_PAGE_MASK)                \
>> +                                   + MSA_WRLEN/8 - 1) >= TARGET_PAGE_SIZE))
>> +    CPUState *cs = CPU(mips_env_get_cpu(env));
>> +    target_ulong page_addr;
>>  
>> +    if (MSA_PAGESPAN(addr)) {
>> +        /* first page */
>> +        tlb_fill(cs, addr, rw, mmu_idx, 0);
>> +        /* second page */
>> +        page_addr = (addr & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
>> +        tlb_fill(cs, page_addr, rw, mmu_idx, 0);
>>      }
>> +#endif
>>  }
> This doesn't do quite what you think it does.  It does trap if the page isn't
> mapped at all, but it doesn't trap if e.g. rw is set and the page is read-only.
> That requires a subsequent check for what permissions were installed by
> tlb_set_page.

I must double-check the behaviour, but please note that here we are filling
QEMU's TLB entries according to the target's TLB entries, so the permission
issue should be handled. I agree with your comment in a later email that for
loads this is too much, as all the loads can be issued and the store into the
vector register can follow. I wasn't sure about that, because this TLB filling
only happens when an access crosses the page boundary.

> I had thought there was a way to look this up besides duplicating the code in
> softmmu_template.h, but I suppose that's in a patch set that never made it in.
>
>
>> +    if (unlikely(addr & ((1 << DF) - 1))) {                             \
>> +        /* work-around for misaligned accesses */                       \
>> +        for (i = 0; i < DF_ELEMENTS(DF_BYTE); i++) {                    \
>> +            pwd->b[i] = do_lbu(env, addr + (i << DF_BYTE), mmu_idx);    \
>> +        }                                                               \
>> +        correct_vector_endianness_ ## TYPE(pwd, pwd);                   \
> Why byte accesses?  The softmmu helpers are guaranteed to support misalignment.
The reason for using byte accesses here is that we need to distinguish
instructions that allow misalignment from those that do not. MIPS R5 doesn't
support misaligned accesses while MSA does, and in do_unaligned_access() it is
quite hard to recover that information. This was the trigger for your patch.
Since I had no indication of the status of your work, I used this work-around
to avoid the problem.

> As an aside, consider moving away from
>
> #define HELPER_LD(name, insn, type)                                     \
> static inline type do_##name(CPUMIPSState *env, target_ulong addr,      \
>                              int mem_idx)                               \
> {                                                                       \
>     switch (mem_idx)                                                    \
>     {                                                                   \
>     case 0: return (type) cpu_##insn##_kernel(env, addr); break;        \
>     case 1: return (type) cpu_##insn##_super(env, addr); break;         \
>     default:                                                            \
>     case 2: return (type) cpu_##insn##_user(env, addr); break;          \
>     }                                                                   \
> }
>
> to using helper_ret_*_mmu directly.  Which allows you to specify the mmu_idx
> directly rather than bouncing around different thunks.  It also allows you to
> pass in GETRA(), which would allow these helpers to use cpu_restore_state on
> faults.
>
Agreed.
> r~

Regards,
Yongbok


* Re: [Qemu-devel] [PATCH v3 2/2] target-mips: Misaligned memory accesses for MSA
  2015-05-14  9:00     ` Yongbok Kim
@ 2015-05-14  9:46       ` Yongbok Kim
  2015-05-14 18:44         ` Richard Henderson
  0 siblings, 1 reply; 22+ messages in thread
From: Yongbok Kim @ 2015-05-14  9:46 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: peter.maydell, leon.alrae, afaerber

On 14/05/2015 10:00, Yongbok Kim wrote:
> On 13/05/2015 20:28, Richard Henderson wrote:
>> On 05/13/2015 08:37 AM, Yongbok Kim wrote:
>>> +static inline void ensure_atomic_msa_block_access(CPUMIPSState *env,
>>> +                                                  target_ulong addr,
>>> +                                                  int rw,
>>> +                                                  int mmu_idx)
>>>  {
>>> +#if !defined(CONFIG_USER_ONLY)
>>> +#define MSA_PAGESPAN(x) (unlikely((((x) & ~TARGET_PAGE_MASK)                \
>>> +                                   + MSA_WRLEN/8 - 1) >= TARGET_PAGE_SIZE))
>>> +    CPUState *cs = CPU(mips_env_get_cpu(env));
>>> +    target_ulong page_addr;
>>>  
>>> +    if (MSA_PAGESPAN(addr)) {
>>> +        /* first page */
>>> +        tlb_fill(cs, addr, rw, mmu_idx, 0);
>>> +        /* second page */
>>> +        page_addr = (addr & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
>>> +        tlb_fill(cs, page_addr, rw, mmu_idx, 0);
>>>      }
>>> +#endif
>>>  }
>> This doesn't do quite what you think it does.  It does trap if the page isn't
>> mapped at all, but it doesn't trap if e.g. rw is set and the page is read-only.
>> That requires a subsequent check for what permissions were installed by
>> tlb_set_page.
> I must double-check the behaviour, but please note that here we are filling
> QEMU's TLB entries according to the target's TLB entries, so the permission
> issue should be handled. I agree with your comment in a later email that for
> loads this is too much, as all the loads can be issued and the store into the
> vector register can follow. I wasn't sure about that, because this TLB filling
> only happens when an access crosses the page boundary.
>
>
In addition to that, if we issue all the loads and, say, only the first page
is accessible, from an architectural point of view it would be fine, as
nothing will be stored in the vector register; but the access to the first
page is "visible" on the data bus.
Do you think this wouldn't cause any problem?
It might just be implementation dependent, though.

Regards,
Yongbok


* Re: [Qemu-devel] [PATCH v3 2/2] target-mips: Misaligned memory accesses for MSA
  2015-05-13 19:28   ` Richard Henderson
  2015-05-13 19:56     ` Maciej W. Rozycki
  2015-05-14  9:00     ` Yongbok Kim
@ 2015-05-14  9:50     ` Leon Alrae
  2015-05-14 15:27       ` Richard Henderson
  2 siblings, 1 reply; 22+ messages in thread
From: Leon Alrae @ 2015-05-14  9:50 UTC (permalink / raw)
  To: Richard Henderson, Yongbok Kim, qemu-devel; +Cc: peter.maydell, afaerber

On 13/05/2015 20:28, Richard Henderson wrote:
> As an aside, consider moving away from
> 
> #define HELPER_LD(name, insn, type)                                     \
> static inline type do_##name(CPUMIPSState *env, target_ulong addr,      \
>                              int mem_idx)                               \
> {                                                                       \
>     switch (mem_idx)                                                    \
>     {                                                                   \
>     case 0: return (type) cpu_##insn##_kernel(env, addr); break;        \
>     case 1: return (type) cpu_##insn##_super(env, addr); break;         \
>     default:                                                            \
>     case 2: return (type) cpu_##insn##_user(env, addr); break;          \
>     }                                                                   \
> }
> 
> to using helper_ret_*_mmu directly.  Which allows you to specify the mmu_idx
> directly rather than bouncing around different thunks.  It also allows you to
> pass in GETRA(), which would allow these helpers to use cpu_restore_state on
> faults.

Just to confirm: before using helper_ret_*_mmu directly we should also
check whether we can take the fast path (not sure if "fast path" is the
correct term in this case, as we've already generated a call to a helper
function...):

    if (unlikely(env->tlb_table[mmu_idx][page_index].ADDR_READ !=
                 (addr & (TARGET_PAGE_MASK | (DATA_SIZE - 1))))) {

So basically we'll have functions similar to cpu_##insn##_*, but taking
mmu_idx and GETRA() and calling helper_ret_*_mmu directly.

BTW what is the reason that we aren't passing GETRA() to cpu_##insn##_*
and using helper_ret_*_mmu directly in general?

Thanks,
Leon


* Re: [Qemu-devel] [PATCH v3 2/2] target-mips: Misaligned memory accesses for MSA
  2015-05-14  8:51                 ` Leon Alrae
@ 2015-05-14 11:22                   ` Maciej W. Rozycki
  0 siblings, 0 replies; 22+ messages in thread
From: Maciej W. Rozycki @ 2015-05-14 11:22 UTC (permalink / raw)
  To: Leon Alrae
  Cc: Yongbok Kim, peter.maydell, qemu-devel, afaerber, Richard Henderson

On Thu, 14 May 2015, Leon Alrae wrote:

> >  I don't think we have.  The specification is a bit unclear I must admit 
> > and it also defines the details of vector load and store operations as 
> > implementation dependent, so there's no further clarification.
> 
> This is specified in "MIPS Architecture For Programmers Volume I-A:
> Introduction to the MIPS64 Architecture", Revision: 6.01, Imagination
> Technologies, Document Number: MD00083, August 20, 2014, p.142:
> 
> "For example, a misaligned vector load instruction will never leave its
> vector destination register half written, if part of a page split
> succeeds and the other part takes an exception. It is either all done,
> or not at all."

 Thanks.  It's good to see r6 has clarified the matters around here and 
I think we can (and for simplicity ought to) apply them to r5 too.

  Maciej


* Re: [Qemu-devel] [PATCH v3 2/2] target-mips: Misaligned memory accesses for MSA
  2015-05-14  9:50     ` Leon Alrae
@ 2015-05-14 15:27       ` Richard Henderson
  2015-05-14 19:12         ` Richard Henderson
  0 siblings, 1 reply; 22+ messages in thread
From: Richard Henderson @ 2015-05-14 15:27 UTC (permalink / raw)
  To: Leon Alrae, Yongbok Kim, qemu-devel; +Cc: peter.maydell, afaerber

On 05/14/2015 02:50 AM, Leon Alrae wrote:
> Just to confirm: before using helper_ret_*_mmu directly we should also
> check whether we can take the fast path (not sure if "fast path" is the
> correct term in this case, as we've already generated a call to a helper
> function...):
> 
>     if (unlikely(env->tlb_table[mmu_idx][page_index].ADDR_READ !=
>                  (addr & (TARGET_PAGE_MASK | (DATA_SIZE - 1))))) {

Nearly.  You don't want the (DATA_SIZE - 1) part -- that exists to catch
misalignment.  You do need the test from helper_le_st_name:


    target_ulong tlb_addr = env->tlb_table[mmu_idx][index].addr_write;
    if ((addr & TARGET_PAGE_MASK)
        != (tlb_addr & (TARGET_PAGE_MASK | TLB_INVALID_MASK))) {
        if (!VICTIM_TLB_HIT(addr_write)) {
            tlb_fill(ENV_GET_CPU(env), addr, MMU_DATA_STORE, mmu_idx, retaddr);
        }
    }

And before we spread that code around too much, I think we ought to create
another helper.

Perhaps

  void probe_read(CPUArchState *env, target_ulong addr, int mmu_idx,
                            uintptr_t retaddr);

  void probe_write(CPUArchState *env, target_ulong addr, int mmu_idx,
                            uintptr_t retaddr);
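
One possible shape for probe_write, essentially the helper_le_st_name test
above wrapped into a function (a sketch only: the victim-TLB lookup is
omitted, and where such a helper would live is left open):

  /* Probe whether a guest write to addr is permitted.  If it is not,
     tlb_fill raises the guest exception and does not return; otherwise a
     writable TLB entry for the page is left installed.  */
  void probe_write(CPUArchState *env, target_ulong addr, int mmu_idx,
                   uintptr_t retaddr)
  {
      int index = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
      target_ulong tlb_addr = env->tlb_table[mmu_idx][index].addr_write;

      if ((addr & TARGET_PAGE_MASK)
          != (tlb_addr & (TARGET_PAGE_MASK | TLB_INVALID_MASK))) {
          tlb_fill(ENV_GET_CPU(env), addr, MMU_DATA_STORE, mmu_idx, retaddr);
      }
  }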

> BTW what is the reason that we aren't passing GETRA() to cpu_##insn##_*
> and using helper_ret_*_mmu directly in general?

Because cpu_insn_* is the older interface which hasn't been removed yet.


r~


* Re: [Qemu-devel] [PATCH v3 2/2] target-mips: Misaligned memory accesses for MSA
  2015-05-14  9:46       ` Yongbok Kim
@ 2015-05-14 18:44         ` Richard Henderson
  0 siblings, 0 replies; 22+ messages in thread
From: Richard Henderson @ 2015-05-14 18:44 UTC (permalink / raw)
  To: Yongbok Kim, qemu-devel; +Cc: peter.maydell, leon.alrae, afaerber

On 05/14/2015 02:46 AM, Yongbok Kim wrote:
> In addition to that, if we issue all the loads and, say, only the first page
> is accessible, from an architectural point of view it would be fine, as
> nothing will be stored in the vector register; but the access to the first
> page is "visible" on the data bus.
> Do you think this wouldn't cause any problem?
> It might just be implementation dependent, though.

I don't think it would cause a problem unless the user is silly enough to issue
an MSA read to device memory.


r~


* Re: [Qemu-devel] [PATCH v3 2/2] target-mips: Misaligned memory accesses for MSA
  2015-05-14 15:27       ` Richard Henderson
@ 2015-05-14 19:12         ` Richard Henderson
  2015-05-15 12:09           ` Leon Alrae
  0 siblings, 1 reply; 22+ messages in thread
From: Richard Henderson @ 2015-05-14 19:12 UTC (permalink / raw)
  To: Leon Alrae, Yongbok Kim, qemu-devel
  Cc: Peter Maydell, afaerber, Paolo Bonzini

On 05/14/2015 08:27 AM, Richard Henderson wrote:
> Perhaps
> 
>   void probe_read(CPUArchState *env, target_ulong addr, int mmu_idx,
>                             uintptr_t retaddr);
> 
>   void probe_write(CPUArchState *env, target_ulong addr, int mmu_idx,
>                             uintptr_t retaddr);

Alternately, return the host address and then we're mostly overlapped with
tlb_vaddr_to_host.  Which was the function I was trying to remember earlier,
but doesn't *quite* do what I hoped.

What tlb_vaddr_to_host doesn't do is force a tlb_fill when the page in question
isn't in the tlb.  The helper dc_zva uses a subsequent store to force that.

I do wonder if the arm helper might be better written as

  uint64_t blocksize = ...
  uint64_t writesize = MIN(blocksize, TARGET_PAGE_SIZE);

  for (ofs = 0; ofs < blocksize; ofs += writesize) {
    hostaddr = probe_write(env, vaddr + ofs, mmu_idx, GETRA());
    if (hostaddr != NULL) {
        memset(hostaddr, 0, MIN(blocksize, writesize));
    } else {
        /* Since we didn't trap out of probe_write, the map is present
           and writable, but isn't RAM.  Do a series of byte writes as
           the architecture demands.  */
        for (i = 0; i < writesize; ++i) {
            helper_ret_stb_mmu(env, vaddr + ofs + i, 0, oi, GETRA());
        }
    }
  }

Which does have different properties wrt the size of the memset in currently
unused cases of very large blocksize.  And probably the case of notdirty or
watchpointed ram as well.

For the case of MIPS under discussion, we could write this as

  baddr = probe_write(env, addr, mmu_idx, GETRA());
  eaddr = probe_write(env, addr + 15, mmu_idx, GETRA());

  /* We know both pages are present and writable.  */
  if (eaddr == baddr + 15) {
      /* Consecutive pages in RAM.  */
      memcpy(baddr, register, 16);
  } else {
      /* Someone's doing an MSA store to device memory.  */
      for (i = 0; i < 2; ++i) {
          helper_ret_stq_mmu(env, vaddr + i*8, register.d[0],
                             make_memop_idx(MO_UNALN | MO_TEQ, mmu_idx),
                             GETRA());
      }
  }

Thoughts?


r~


* Re: [Qemu-devel] [PATCH v3 2/2] target-mips: Misaligned memory accesses for MSA
  2015-05-14 19:12         ` Richard Henderson
@ 2015-05-15 12:09           ` Leon Alrae
  2015-05-15 13:43             ` Richard Henderson
  0 siblings, 1 reply; 22+ messages in thread
From: Leon Alrae @ 2015-05-15 12:09 UTC (permalink / raw)
  To: Richard Henderson, Yongbok Kim, qemu-devel
  Cc: Peter Maydell, afaerber, Paolo Bonzini

On 14/05/2015 20:12, Richard Henderson wrote:
>   /* We know both pages are present and writable.  */
>   if (eaddr == baddr + 15) {
>       /* Consecutive pages in RAM.  */
>       memcpy(baddr, register, 16);
>   } else {
>       /* Someone's doing an MSA store to device memory.  */
>       for (i = 0; i < 2; ++i) {
>           helper_ret_stq_mmu(env, vaddr + i*8, register.d[0],
>                              make_memop_idx(MO_UNALN | MO_TEQ, mmu_idx),
>                              GETRA());
>       }
>   }

We would additionally need to take care of vector elements' endianness
before using this code. Therefore always using helper_ret_st[bwlq]_mmu
(depending on data format) in a loop probably is simpler.
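
As a rough sketch of that suggestion for the doubleword format (assuming the
memop+mmu_idx ("oi") interface from Richard's example above, with the
page-span probing still done beforehand; this is not the final patch):

  /* Sketch only: store each 64-bit element through the softmmu helper with
     MO_UNALN, so misalignment and target endianness are handled per element
     and neither byte splitting nor manual swapping is needed.  */
  void helper_msa_st_d(CPUMIPSState *env, uint32_t wd, uint32_t rs,
                       int32_t s10)
  {
      wr_t *pwd = &(env->active_fpu.fpr[wd].wr);
      target_ulong addr = env->active_tc.gpr[rs] + (s10 << DF_DOUBLE);
      int mmu_idx = cpu_mmu_index(env);
      TCGMemOpIdx oi = make_memop_idx(MO_TEQ | MO_UNALN, mmu_idx);
      int i;

      for (i = 0; i < DF_ELEMENTS(DF_DOUBLE); i++) {
          helper_ret_stq_mmu(env, addr + (i << DF_DOUBLE), pwd->d[i],
                             oi, GETRA());
      }
  }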

Leon


* Re: [Qemu-devel] [PATCH v3 2/2] target-mips: Misaligned memory accesses for MSA
  2015-05-15 12:09           ` Leon Alrae
@ 2015-05-15 13:43             ` Richard Henderson
  2015-05-15 14:04               ` Leon Alrae
  0 siblings, 1 reply; 22+ messages in thread
From: Richard Henderson @ 2015-05-15 13:43 UTC (permalink / raw)
  To: Leon Alrae, Yongbok Kim, qemu-devel
  Cc: Peter Maydell, afaerber, Paolo Bonzini

On 05/15/2015 05:09 AM, Leon Alrae wrote:
> On 14/05/2015 20:12, Richard Henderson wrote:
>>   /* We know both pages are present and writable.  */
>>   if (eaddr == baddr + 15) {
>>       /* Consecutive pages in RAM.  */
>>       memcpy(baddr, register, 16);
>>   } else {
>>       /* Someone's doing an MSA store to device memory.  */
>>       for (i = 0; i < 2; ++i) {
>>           helper_ret_stq_mmu(env, vaddr + i*8, register.d[0],
>>                              make_memop_idx(MO_UNALN | MO_TEQ, mmu_idx),
>>                              GETRA());
>>       }
>>   }
> 
> We would additionally need to take care of vector elements' endianness
> before using this code. Therefore always using helper_ret_st[bwlq]_mmu
> (depending on data format) in a loop probably is simpler.

I suppose.  I'd thought one of the patches defined functions to do the
swapping, so I assumed that would still be used.


r~


* Re: [Qemu-devel] [PATCH v3 2/2] target-mips: Misaligned memory accesses for MSA
  2015-05-15 13:43             ` Richard Henderson
@ 2015-05-15 14:04               ` Leon Alrae
  0 siblings, 0 replies; 22+ messages in thread
From: Leon Alrae @ 2015-05-15 14:04 UTC (permalink / raw)
  To: Richard Henderson, Yongbok Kim, qemu-devel
  Cc: Peter Maydell, afaerber, Paolo Bonzini

On 15/05/2015 14:43, Richard Henderson wrote:
> On 05/15/2015 05:09 AM, Leon Alrae wrote:
>> On 14/05/2015 20:12, Richard Henderson wrote:
>>>   /* We know both pages are present and writable.  */
>>>   if (eaddr == baddr + 15) {
>>>       /* Consecutive pages in RAM.  */
>>>       memcpy(baddr, register, 16);
>>>   } else {
>>>       /* Someone's doing an MSA store to device memory.  */
>>>       for (i = 0; i < 2; ++i) {
>>>           helper_ret_stq_mmu(env, vaddr + i*8, register.d[0],
>>>                              make_memop_idx(MO_UNALN | MO_TEQ, mmu_idx),
>>>                              GETRA());
>>>       }
>>>   }
>>
>> We would additionally need to take care of vector elements' endianness
>> before using this code. Therefore always using helper_ret_st[bwlq]_mmu
>> (depending on data format) in a loop probably is simpler.
> 
> I suppose.  I'd thought one of the patches defined functions to do the
> swapping, so I assumed that would still be used.

IIUC they were required because of the byte access workaround, but now
with the MO_UNALN support in TCG (thanks for this!) neither the
workaround nor the swapping is needed for MSA I think.

Leon

