* [PATCH 0/5] tcg: Issue memory barriers for guest memory model
@ 2021-03-16 22:07 Richard Henderson
  2021-03-16 22:07 ` [PATCH 1/5] tcg: Decode the operand to INDEX_op_mb in dumps Richard Henderson
                   ` (6 more replies)
  0 siblings, 7 replies; 9+ messages in thread
From: Richard Henderson @ 2021-03-16 22:07 UTC
  To: qemu-devel; +Cc: peter.maydell

This is intended to fix the current aarch64 host failure
for s390x guest cdrom-test.  This is caused by the io thread
issuing memory barriers that are supposed to be matched by
the vcpu, but are elided by tcg in rr mode as "unnecessary".

I know Peter would like a smaller patch to sync the io thread
with the vcpu thread.  I've made a couple of attempts at this,
but haven't managed to get something reliable (although the failure
is now irritatingly infrequent -- about 1 in 500).

I have additional patches to further optimize barriers, and to
generate load-acquire/store-release instructions in tcg.
But it's late in the release cycle, etc etc.

I've done nothing to measure the performance impact of this.
I quit the cdrom-test cycle after 4000 passes.


r~


Richard Henderson (5):
  tcg: Decode the operand to INDEX_op_mb in dumps
  tcg: Do not elide memory barriers for CF_PARALLEL
  tcg: Elide memory barriers implied by the host memory model
  tcg: Create tcg_req_mo
  tcg: Add host memory barriers to cpu_ldst.h interfaces

 include/exec/cpu_ldst.h |  7 ++++
 include/tcg/tcg.h       | 20 +++++++++++
 accel/tcg/cputlb.c      |  2 ++
 accel/tcg/tcg-all.c     |  6 +---
 accel/tcg/user-exec.c   | 17 +++++++++
 tcg/tcg-op.c            | 19 +++++-----
 tcg/tcg.c               | 79 +++++++++++++++++++++++++++++++++++++++++
 7 files changed, 137 insertions(+), 13 deletions(-)

-- 
2.25.1




* [PATCH 1/5] tcg: Decode the operand to INDEX_op_mb in dumps
  2021-03-16 22:07 [PATCH 0/5] tcg: Issue memory barriers for guest memory model Richard Henderson
@ 2021-03-16 22:07 ` Richard Henderson
  2021-03-17 13:32   ` Philippe Mathieu-Daudé
  2021-03-16 22:07 ` [PATCH 2/5] tcg: Do not elide memory barriers for CF_PARALLEL Richard Henderson
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 9+ messages in thread
From: Richard Henderson @ 2021-03-16 22:07 UTC
  To: qemu-devel; +Cc: peter.maydell

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/tcg.c | 79 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 79 insertions(+)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index 2991112829..23a94d771c 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -2415,6 +2415,85 @@ static void tcg_dump_ops(TCGContext *s, bool have_prefs)
                                 arg_label(op->args[k])->id);
                 i++, k++;
                 break;
+            case INDEX_op_mb:
+                {
+                    TCGBar membar = op->args[k];
+                    const char *b_op, *m_op;
+
+                    switch (membar & TCG_BAR_SC) {
+                    case 0:
+                        b_op = "none";
+                        break;
+                    case TCG_BAR_LDAQ:
+                        b_op = "acq";
+                        break;
+                    case TCG_BAR_STRL:
+                        b_op = "rel";
+                        break;
+                    case TCG_BAR_SC:
+                        b_op = "seq";
+                        break;
+                    default:
+                        g_assert_not_reached();
+                    }
+
+                    switch (membar & TCG_MO_ALL) {
+                    case 0:
+                        m_op = "none";
+                        break;
+                    case TCG_MO_LD_LD:
+                        m_op = "rr";
+                        break;
+                    case TCG_MO_LD_ST:
+                        m_op = "rw";
+                        break;
+                    case TCG_MO_ST_LD:
+                        m_op = "wr";
+                        break;
+                    case TCG_MO_ST_ST:
+                        m_op = "ww";
+                        break;
+                    case TCG_MO_LD_LD | TCG_MO_LD_ST:
+                        m_op = "rr+rw";
+                        break;
+                    case TCG_MO_LD_LD | TCG_MO_ST_LD:
+                        m_op = "rr+wr";
+                        break;
+                    case TCG_MO_LD_LD | TCG_MO_ST_ST:
+                        m_op = "rr+ww";
+                        break;
+                    case TCG_MO_LD_ST | TCG_MO_ST_LD:
+                        m_op = "rw+wr";
+                        break;
+                    case TCG_MO_LD_ST | TCG_MO_ST_ST:
+                        m_op = "rw+ww";
+                        break;
+                    case TCG_MO_ST_LD | TCG_MO_ST_ST:
+                        m_op = "wr+ww";
+                        break;
+                    case TCG_MO_LD_LD | TCG_MO_LD_ST | TCG_MO_ST_LD:
+                        m_op = "rr+rw+wr";
+                        break;
+                    case TCG_MO_LD_LD | TCG_MO_LD_ST | TCG_MO_ST_ST:
+                        m_op = "rr+rw+ww";
+                        break;
+                    case TCG_MO_LD_LD | TCG_MO_ST_LD | TCG_MO_ST_ST:
+                        m_op = "rr+wr+ww";
+                        break;
+                    case TCG_MO_LD_ST | TCG_MO_ST_LD | TCG_MO_ST_ST:
+                        m_op = "rw+wr+ww";
+                        break;
+                    case TCG_MO_ALL:
+                        m_op = "all";
+                        break;
+                    default:
+                        g_assert_not_reached();
+                    }
+
+                    col += qemu_log("%s%s:%s", (k ? "," : ""), b_op, m_op);
+                    i++, k++;
+                }
+                break;
             default:
                 break;
             }
-- 
2.25.1




* [PATCH 2/5] tcg: Do not elide memory barriers for CF_PARALLEL
  2021-03-16 22:07 [PATCH 0/5] tcg: Issue memory barriers for guest memory model Richard Henderson
  2021-03-16 22:07 ` [PATCH 1/5] tcg: Decode the operand to INDEX_op_mb in dumps Richard Henderson
@ 2021-03-16 22:07 ` Richard Henderson
  2021-03-16 22:07 ` [PATCH 3/5] tcg: Elide memory barriers implied by the host memory model Richard Henderson
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Richard Henderson @ 2021-03-16 22:07 UTC
  To: qemu-devel; +Cc: peter.maydell

The virtio devices require proper memory ordering between
the vcpus and the iothreads.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/tcg-op.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index 70475773f4..76dc7d8dc5 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -97,9 +97,13 @@ void tcg_gen_op6(TCGOpcode opc, TCGArg a1, TCGArg a2, TCGArg a3,
 
 void tcg_gen_mb(TCGBar mb_type)
 {
-    if (tcg_ctx->tb_cflags & CF_PARALLEL) {
-        tcg_gen_op1(INDEX_op_mb, mb_type);
-    }
+    /*
+     * It is tempting to elide the barrier in a single-threaded context
+     * (i.e. !(tb_cflags & CF_PARALLEL)), however, even with a single cpu
+     * we have i/o threads running in parallel, and lack of memory order
+     * can result in e.g. virtio queue entries being read incorrectly.
+     */
+    tcg_gen_op1(INDEX_op_mb, mb_type);
 }
 
 /* 32 bit ops */
-- 
2.25.1




* [PATCH 3/5] tcg: Elide memory barriers implied by the host memory model
  2021-03-16 22:07 [PATCH 0/5] tcg: Issue memory barriers for guest memory model Richard Henderson
  2021-03-16 22:07 ` [PATCH 1/5] tcg: Decode the operand to INDEX_op_mb in dumps Richard Henderson
  2021-03-16 22:07 ` [PATCH 2/5] tcg: Do not elide memory barriers for CF_PARALLEL Richard Henderson
@ 2021-03-16 22:07 ` Richard Henderson
  2021-03-16 22:07 ` [PATCH 4/5] tcg: Create tcg_req_mo Richard Henderson
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Richard Henderson @ 2021-03-16 22:07 UTC
  To: qemu-devel; +Cc: peter.maydell

Reduce the set of required barriers to those needed by
the host right from the beginning.
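
The masking step can be illustrated in isolation.  What follows is
only a sketch, not the QEMU code: the MO_* names and bit values are
stand-ins for the TCG_MO_* flags, and residual_mo() is a hypothetical
helper that also drops the barrier-kind bits, which the real
tcg_gen_mb() keeps in the operand.

```c
/* Stand-ins for the TCG_MO_* ordering bits; names and values are assumed. */
enum {
    MO_LD_LD = 0x01,   /* earlier load ordered before later load   */
    MO_LD_ST = 0x02,   /* earlier load ordered before later store  */
    MO_ST_LD = 0x04,   /* earlier store ordered before later load  */
    MO_ST_ST = 0x08,   /* earlier store ordered before later store */
    MO_ALL   = 0x0f,
};

/*
 * Hypothetical helper: return the orderings that still need an explicit
 * barrier once everything the host guarantees by default is removed.
 * A zero result means no barrier op has to be emitted at all.
 */
static unsigned residual_mo(unsigned requested, unsigned host_default)
{
    return requested & ~host_default & MO_ALL;
}
```

Under these assumptions, a host that orders everything except
store-then-load reduces a request for MO_ALL to MO_ST_LD alone and
drops a request for MO_LD_LD | MO_ST_ST entirely, while a host with
no default ordering keeps every requested bit.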

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/tcg-op.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index 76dc7d8dc5..c8501508c2 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -102,8 +102,13 @@ void tcg_gen_mb(TCGBar mb_type)
      * (i.e. !(tb_cflags & CF_PARALLEL)), however, even with a single cpu
      * we have i/o threads running in parallel, and lack of memory order
      * can result in e.g. virtio queue entries being read incorrectly.
+     *
+     * That said, we can elide anything which the host provides for free.
      */
-    tcg_gen_op1(INDEX_op_mb, mb_type);
+    mb_type &= ~TCG_TARGET_DEFAULT_MO;
+    if (mb_type & TCG_MO_ALL) {
+        tcg_gen_op1(INDEX_op_mb, mb_type);
+    }
 }
 
 /* 32 bit ops */
-- 
2.25.1




* [PATCH 4/5] tcg: Create tcg_req_mo
  2021-03-16 22:07 [PATCH 0/5] tcg: Issue memory barriers for guest memory model Richard Henderson
                   ` (2 preceding siblings ...)
  2021-03-16 22:07 ` [PATCH 3/5] tcg: Elide memory barriers implied by the host memory model Richard Henderson
@ 2021-03-16 22:07 ` Richard Henderson
  2021-03-16 22:07 ` [PATCH 5/5] tcg: Add host memory barriers to cpu_ldst.h interfaces Richard Henderson
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Richard Henderson @ 2021-03-16 22:07 UTC
  To: qemu-devel; +Cc: peter.maydell

Split out the logic to emit a host memory barrier in response to
a guest memory operation.  Do not provide a true default for
TCG_GUEST_DEFAULT_MO because the defined() check will still be
useful for determining if a guest has been updated for MTTCG.
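
As a rough model of the filter, written as ordinary functions rather
than the preprocessor macro in the patch: the names below are
illustrative only, and a guest without TCG_GUEST_DEFAULT_MO is
modelled by passing MO_ALL as its ordering.

```c
#include <stdbool.h>

/* Stand-ins for the TCG_MO_* ordering bits; names and values are assumed. */
enum {
    MO_LD_LD = 0x01,
    MO_LD_ST = 0x02,
    MO_ST_LD = 0x04,
    MO_ST_ST = 0x08,
    MO_ALL   = 0x0f,
};

/* Keep only orderings the guest demands and the host does not provide. */
static unsigned req_mo(unsigned type, unsigned guest_mo, unsigned host_mo)
{
    return type & guest_mo & ~host_mo & MO_ALL;
}

/*
 * The memory orders are compatible when no combination of accesses
 * would ever need an explicit barrier, i.e. the host is at least as
 * strict as the guest.
 */
static bool orders_compatible(unsigned guest_mo, unsigned host_mo)
{
    return req_mo(MO_ALL, guest_mo, host_mo) == 0;
}
```

With this model, a guest modelled as MO_ALL is compatible only with a
host that also provides MO_ALL, which is the conservative outcome for
a guest that has not been updated.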

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/tcg/tcg.h   | 20 ++++++++++++++++++++
 accel/tcg/tcg-all.c |  6 +-----
 tcg/tcg-op.c        |  8 +-------
 3 files changed, 22 insertions(+), 12 deletions(-)

diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index 0f0695e90d..395b3b6964 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -1245,6 +1245,26 @@ static inline unsigned get_mmuidx(TCGMemOpIdx oi)
     return oi & 15;
 }
 
+/**
+ * tcg_req_mo:
+ * @type: TCGBar
+ *
+ * Filter @type to the barrier that is required for the guest
+ * memory ordering vs the host memory ordering.  A non-zero
+ * result indicates that some barrier is required.
+ *
+ * If TCG_GUEST_DEFAULT_MO is not defined, assume that the
+ * guest requires strict ordering.
+ *
+ * This is a macro so that it's constant even without optimization.
+ */
+#ifdef TCG_GUEST_DEFAULT_MO
+# define tcg_req_mo(type) \
+    ((type) & TCG_GUEST_DEFAULT_MO & ~TCG_TARGET_DEFAULT_MO)
+#else
+# define tcg_req_mo(type) ((type) & ~TCG_TARGET_DEFAULT_MO)
+#endif
+
 /**
  * tcg_qemu_tb_exec:
  * @env: pointer to CPUArchState for the CPU
diff --git a/accel/tcg/tcg-all.c b/accel/tcg/tcg-all.c
index e378c2db73..6ae51e3476 100644
--- a/accel/tcg/tcg-all.c
+++ b/accel/tcg/tcg-all.c
@@ -69,11 +69,7 @@ DECLARE_INSTANCE_CHECKER(TCGState, TCG_STATE,
 
 static bool check_tcg_memory_orders_compatible(void)
 {
-#if defined(TCG_GUEST_DEFAULT_MO) && defined(TCG_TARGET_DEFAULT_MO)
-    return (TCG_GUEST_DEFAULT_MO & ~TCG_TARGET_DEFAULT_MO) == 0;
-#else
-    return false;
-#endif
+    return tcg_req_mo(TCG_MO_ALL) == 0;
 }
 
 static bool default_mttcg_enabled(void)
diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index c8501508c2..12fc8a1b17 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -2796,13 +2796,7 @@ static void gen_ldst_i64(TCGOpcode opc, TCGv_i64 val, TCGv addr,
 
 static void tcg_gen_req_mo(TCGBar type)
 {
-#ifdef TCG_GUEST_DEFAULT_MO
-    type &= TCG_GUEST_DEFAULT_MO;
-#endif
-    type &= ~TCG_TARGET_DEFAULT_MO;
-    if (type) {
-        tcg_gen_mb(type | TCG_BAR_SC);
-    }
+    tcg_gen_mb(tcg_req_mo(type) | TCG_BAR_SC);
 }
 
 static inline TCGv plugin_prep_mem_callbacks(TCGv vaddr)
-- 
2.25.1




* [PATCH 5/5] tcg: Add host memory barriers to cpu_ldst.h interfaces
  2021-03-16 22:07 [PATCH 0/5] tcg: Issue memory barriers for guest memory model Richard Henderson
                   ` (3 preceding siblings ...)
  2021-03-16 22:07 ` [PATCH 4/5] tcg: Create tcg_req_mo Richard Henderson
@ 2021-03-16 22:07 ` Richard Henderson
  2021-03-16 22:30 ` [PATCH 0/5] tcg: Issue memory barriers for guest memory model no-reply
  2021-06-26  6:06 ` Richard Henderson
  6 siblings, 0 replies; 9+ messages in thread
From: Richard Henderson @ 2021-03-16 22:07 UTC
  To: qemu-devel; +Cc: peter.maydell

Bring the majority of helpers into line with the rest of
tcg in respecting guest memory ordering.
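
The shape of the change can be sketched outside QEMU with C11 atomics.
Everything here is hypothetical: atomic_thread_fence() stands in for
smp_mb(), the guest and host orderings are passed as arguments instead
of being compile-time constants, and fences_issued exists only so the
effect can be observed.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Stand-ins for the TCG_MO_* ordering bits; names and values are assumed. */
enum {
    MO_LD_LD = 0x01,
    MO_LD_ST = 0x02,
    MO_ST_LD = 0x04,
    MO_ST_ST = 0x08,
    MO_ALL   = 0x0f,
};

static unsigned fences_issued;   /* demo-only counter, not thread-safe */

/* Issue a full fence only if the guest needs an ordering the host lacks. */
static void req_mo_fence(unsigned type, unsigned guest_mo, unsigned host_mo)
{
    if (type & guest_mo & ~host_mo & MO_ALL) {
        atomic_thread_fence(memory_order_seq_cst);
        fences_issued++;
    }
}

/* A load must be ordered after earlier loads and earlier stores. */
static uint32_t load_u32(const uint32_t *p, unsigned guest_mo,
                         unsigned host_mo)
{
    req_mo_fence(MO_LD_LD | MO_ST_LD, guest_mo, host_mo);
    return *p;
}

/* A store must be ordered after earlier loads and earlier stores. */
static void store_u32(uint32_t *p, uint32_t val, unsigned guest_mo,
                      unsigned host_mo)
{
    req_mo_fence(MO_LD_ST | MO_ST_ST, guest_mo, host_mo);
    *p = val;
}
```

In this sketch the fence precedes the access, matching where the patch
places cpu_req_mo() relative to the load or store.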

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/exec/cpu_ldst.h |  7 +++++++
 accel/tcg/cputlb.c      |  2 ++
 accel/tcg/user-exec.c   | 17 +++++++++++++++++
 3 files changed, 26 insertions(+)

diff --git a/include/exec/cpu_ldst.h b/include/exec/cpu_ldst.h
index ce6ce82618..f0ab79fe3c 100644
--- a/include/exec/cpu_ldst.h
+++ b/include/exec/cpu_ldst.h
@@ -169,6 +169,13 @@ void cpu_stl_le_data_ra(CPUArchState *env, abi_ptr ptr,
 void cpu_stq_le_data_ra(CPUArchState *env, abi_ptr ptr,
                         uint64_t val, uintptr_t ra);
 
+#define cpu_req_mo(type)          \
+    do {                          \
+        if (tcg_req_mo(type)) {   \
+            smp_mb();             \
+        }                         \
+    } while (0)
+
 #if defined(CONFIG_USER_ONLY)
 
 extern __thread uintptr_t helper_retaddr;
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 8a7b779270..a3503eaa71 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -2100,6 +2100,7 @@ static inline uint64_t cpu_load_helper(CPUArchState *env, abi_ptr addr,
     meminfo = trace_mem_get_info(op, mmu_idx, false);
     trace_guest_mem_before_exec(env_cpu(env), addr, meminfo);
 
+    cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD);
     op &= ~MO_SIGN;
     oi = make_memop_idx(op, mmu_idx);
     ret = full_load(env, addr, oi, retaddr);
@@ -2542,6 +2543,7 @@ cpu_store_helper(CPUArchState *env, target_ulong addr, uint64_t val,
     meminfo = trace_mem_get_info(op, mmu_idx, true);
     trace_guest_mem_before_exec(env_cpu(env), addr, meminfo);
 
+    cpu_req_mo(TCG_MO_LD_ST | TCG_MO_ST_ST);
     oi = make_memop_idx(op, mmu_idx);
     store_helper(env, addr, val, oi, retaddr, op);
 
diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c
index 0d8cc27b21..34f6dfcef4 100644
--- a/accel/tcg/user-exec.c
+++ b/accel/tcg/user-exec.c
@@ -843,6 +843,7 @@ uint32_t cpu_ldub_data(CPUArchState *env, abi_ptr ptr)
     uint16_t meminfo = trace_mem_get_info(MO_UB, MMU_USER_IDX, false);
 
     trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
+    cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD);
     ret = ldub_p(g2h(env_cpu(env), ptr));
     qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo);
     return ret;
@@ -854,6 +855,7 @@ int cpu_ldsb_data(CPUArchState *env, abi_ptr ptr)
     uint16_t meminfo = trace_mem_get_info(MO_SB, MMU_USER_IDX, false);
 
     trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
+    cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD);
     ret = ldsb_p(g2h(env_cpu(env), ptr));
     qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo);
     return ret;
@@ -865,6 +867,7 @@ uint32_t cpu_lduw_be_data(CPUArchState *env, abi_ptr ptr)
     uint16_t meminfo = trace_mem_get_info(MO_BEUW, MMU_USER_IDX, false);
 
     trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
+    cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD);
     ret = lduw_be_p(g2h(env_cpu(env), ptr));
     qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo);
     return ret;
@@ -876,6 +879,7 @@ int cpu_ldsw_be_data(CPUArchState *env, abi_ptr ptr)
     uint16_t meminfo = trace_mem_get_info(MO_BESW, MMU_USER_IDX, false);
 
     trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
+    cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD);
     ret = ldsw_be_p(g2h(env_cpu(env), ptr));
     qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo);
     return ret;
@@ -887,6 +891,7 @@ uint32_t cpu_ldl_be_data(CPUArchState *env, abi_ptr ptr)
     uint16_t meminfo = trace_mem_get_info(MO_BEUL, MMU_USER_IDX, false);
 
     trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
+    cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD);
     ret = ldl_be_p(g2h(env_cpu(env), ptr));
     qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo);
     return ret;
@@ -898,6 +903,7 @@ uint64_t cpu_ldq_be_data(CPUArchState *env, abi_ptr ptr)
     uint16_t meminfo = trace_mem_get_info(MO_BEQ, MMU_USER_IDX, false);
 
     trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
+    cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD);
     ret = ldq_be_p(g2h(env_cpu(env), ptr));
     qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo);
     return ret;
@@ -909,6 +915,7 @@ uint32_t cpu_lduw_le_data(CPUArchState *env, abi_ptr ptr)
     uint16_t meminfo = trace_mem_get_info(MO_LEUW, MMU_USER_IDX, false);
 
     trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
+    cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD);
     ret = lduw_le_p(g2h(env_cpu(env), ptr));
     qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo);
     return ret;
@@ -920,6 +927,7 @@ int cpu_ldsw_le_data(CPUArchState *env, abi_ptr ptr)
     uint16_t meminfo = trace_mem_get_info(MO_LESW, MMU_USER_IDX, false);
 
     trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
+    cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD);
     ret = ldsw_le_p(g2h(env_cpu(env), ptr));
     qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo);
     return ret;
@@ -931,6 +939,7 @@ uint32_t cpu_ldl_le_data(CPUArchState *env, abi_ptr ptr)
     uint16_t meminfo = trace_mem_get_info(MO_LEUL, MMU_USER_IDX, false);
 
     trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
+    cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD);
     ret = ldl_le_p(g2h(env_cpu(env), ptr));
     qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo);
     return ret;
@@ -942,6 +951,7 @@ uint64_t cpu_ldq_le_data(CPUArchState *env, abi_ptr ptr)
     uint16_t meminfo = trace_mem_get_info(MO_LEQ, MMU_USER_IDX, false);
 
     trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
+    cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD);
     ret = ldq_le_p(g2h(env_cpu(env), ptr));
     qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo);
     return ret;
@@ -1052,6 +1062,7 @@ void cpu_stb_data(CPUArchState *env, abi_ptr ptr, uint32_t val)
     uint16_t meminfo = trace_mem_get_info(MO_UB, MMU_USER_IDX, true);
 
     trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
+    cpu_req_mo(TCG_MO_LD_ST | TCG_MO_ST_ST);
     stb_p(g2h(env_cpu(env), ptr), val);
     qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo);
 }
@@ -1061,6 +1072,7 @@ void cpu_stw_be_data(CPUArchState *env, abi_ptr ptr, uint32_t val)
     uint16_t meminfo = trace_mem_get_info(MO_BEUW, MMU_USER_IDX, true);
 
     trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
+    cpu_req_mo(TCG_MO_LD_ST | TCG_MO_ST_ST);
     stw_be_p(g2h(env_cpu(env), ptr), val);
     qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo);
 }
@@ -1070,6 +1082,7 @@ void cpu_stl_be_data(CPUArchState *env, abi_ptr ptr, uint32_t val)
     uint16_t meminfo = trace_mem_get_info(MO_BEUL, MMU_USER_IDX, true);
 
     trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
+    cpu_req_mo(TCG_MO_LD_ST | TCG_MO_ST_ST);
     stl_be_p(g2h(env_cpu(env), ptr), val);
     qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo);
 }
@@ -1079,6 +1092,7 @@ void cpu_stq_be_data(CPUArchState *env, abi_ptr ptr, uint64_t val)
     uint16_t meminfo = trace_mem_get_info(MO_BEQ, MMU_USER_IDX, true);
 
     trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
+    cpu_req_mo(TCG_MO_LD_ST | TCG_MO_ST_ST);
     stq_be_p(g2h(env_cpu(env), ptr), val);
     qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo);
 }
@@ -1088,6 +1102,7 @@ void cpu_stw_le_data(CPUArchState *env, abi_ptr ptr, uint32_t val)
     uint16_t meminfo = trace_mem_get_info(MO_LEUW, MMU_USER_IDX, true);
 
     trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
+    cpu_req_mo(TCG_MO_LD_ST | TCG_MO_ST_ST);
     stw_le_p(g2h(env_cpu(env), ptr), val);
     qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo);
 }
@@ -1097,6 +1112,7 @@ void cpu_stl_le_data(CPUArchState *env, abi_ptr ptr, uint32_t val)
     uint16_t meminfo = trace_mem_get_info(MO_LEUL, MMU_USER_IDX, true);
 
     trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
+    cpu_req_mo(TCG_MO_LD_ST | TCG_MO_ST_ST);
     stl_le_p(g2h(env_cpu(env), ptr), val);
     qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo);
 }
@@ -1106,6 +1122,7 @@ void cpu_stq_le_data(CPUArchState *env, abi_ptr ptr, uint64_t val)
     uint16_t meminfo = trace_mem_get_info(MO_LEQ, MMU_USER_IDX, true);
 
     trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
+    cpu_req_mo(TCG_MO_LD_ST | TCG_MO_ST_ST);
     stq_le_p(g2h(env_cpu(env), ptr), val);
     qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo);
 }
-- 
2.25.1




* Re: [PATCH 0/5] tcg: Issue memory barriers for guest memory model
  2021-03-16 22:07 [PATCH 0/5] tcg: Issue memory barriers for guest memory model Richard Henderson
                   ` (4 preceding siblings ...)
  2021-03-16 22:07 ` [PATCH 5/5] tcg: Add host memory barriers to cpu_ldst.h interfaces Richard Henderson
@ 2021-03-16 22:30 ` no-reply
  2021-06-26  6:06 ` Richard Henderson
  6 siblings, 0 replies; 9+ messages in thread
From: no-reply @ 2021-03-16 22:30 UTC
  To: richard.henderson; +Cc: peter.maydell, qemu-devel

Patchew URL: https://patchew.org/QEMU/20210316220735.2048137-1-richard.henderson@linaro.org/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20210316220735.2048137-1-richard.henderson@linaro.org
Subject: [PATCH 0/5] tcg: Issue memory barriers for guest memory model

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 - [tag update]      patchew/20210311143958.562625-1-richard.henderson@linaro.org -> patchew/20210311143958.562625-1-richard.henderson@linaro.org
 * [new tag]         patchew/20210316220735.2048137-1-richard.henderson@linaro.org -> patchew/20210316220735.2048137-1-richard.henderson@linaro.org
Switched to a new branch 'test'
06ceb5a tcg: Add host memory barriers to cpu_ldst.h interfaces
be4ade5 tcg: Create tcg_req_mo
1336778 tcg: Elide memory barriers implied by the host memory model
d0f90d5 tcg: Do not elide memory barriers for CF_PARALLEL
c9f634b tcg: Decode the operand to INDEX_op_mb in dumps

=== OUTPUT BEGIN ===
1/5 Checking commit c9f634bdbe20 (tcg: Decode the operand to INDEX_op_mb in dumps)
2/5 Checking commit d0f90d584f17 (tcg: Do not elide memory barriers for CF_PARALLEL)
3/5 Checking commit 133677838f14 (tcg: Elide memory barriers implied by the host memory model)
4/5 Checking commit be4ade51a457 (tcg: Create tcg_req_mo)
5/5 Checking commit 06ceb5ad212a (tcg: Add host memory barriers to cpu_ldst.h interfaces)
ERROR: memory barrier without comment
#189: FILE: include/exec/cpu_ldst.h:175:
+            smp_mb();             \

total: 1 errors, 0 warnings, 146 lines checked

Patch 5/5 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/20210316220735.2048137-1-richard.henderson@linaro.org/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com


* Re: [PATCH 1/5] tcg: Decode the operand to INDEX_op_mb in dumps
  2021-03-16 22:07 ` [PATCH 1/5] tcg: Decode the operand to INDEX_op_mb in dumps Richard Henderson
@ 2021-03-17 13:32   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 9+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-03-17 13:32 UTC
  To: Richard Henderson, qemu-devel; +Cc: peter.maydell

On 3/16/21 11:07 PM, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  tcg/tcg.c | 79 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 79 insertions(+)

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>



* Re: [PATCH 0/5] tcg: Issue memory barriers for guest memory model
  2021-03-16 22:07 [PATCH 0/5] tcg: Issue memory barriers for guest memory model Richard Henderson
                   ` (5 preceding siblings ...)
  2021-03-16 22:30 ` [PATCH 0/5] tcg: Issue memory barriers for guest memory model no-reply
@ 2021-06-26  6:06 ` Richard Henderson
  6 siblings, 0 replies; 9+ messages in thread
From: Richard Henderson @ 2021-06-26  6:06 UTC
  To: qemu-devel; +Cc: peter.maydell

Ping.  A local rebase seems to apply cleanly.

r~

On 3/16/21 3:07 PM, Richard Henderson wrote:
> This is intended to fix the current aarch64 host failure
> for s390x guest cdrom-test.  This is caused by the io thread
> issuing memory barriers that are supposed to be matched by
> the vcpu, but are elided by tcg in rr mode as "unnecessary".
> 
> I know Peter would like a smaller patch to sync the io thread
> with the vcpu thread.  I've made a couple of attempts at this,
> but haven't managed to get something reliable (although the failure
> is now irritatingly infrequent -- about 1 in 500).
> 
> I have additional patches to further optimize barriers, and to
> generate load-acquire/store-release instructions in tcg.
> But it's late in the release cycle, etc etc.
> 
> I've done nothing to measure the performance impact of this.
> I quit the cdrom-test cycle after 4000 passes.
> 
> 
> r~
> 
> 
> Richard Henderson (5):
>    tcg: Decode the operand to INDEX_op_mb in dumps
>    tcg: Do not elide memory barriers for CF_PARALLEL
>    tcg: Elide memory barriers implied by the host memory model
>    tcg: Create tcg_req_mo
>    tcg: Add host memory barriers to cpu_ldst.h interfaces
> 
>   include/exec/cpu_ldst.h |  7 ++++
>   include/tcg/tcg.h       | 20 +++++++++++
>   accel/tcg/cputlb.c      |  2 ++
>   accel/tcg/tcg-all.c     |  6 +---
>   accel/tcg/user-exec.c   | 17 +++++++++
>   tcg/tcg-op.c            | 19 +++++-----
>   tcg/tcg.c               | 79 +++++++++++++++++++++++++++++++++++++++++
>   7 files changed, 137 insertions(+), 13 deletions(-)
> 



