* [PATCH 0/5] tcg: Issue memory barriers for guest memory model
@ 2021-03-16 22:07 Richard Henderson
2021-03-16 22:07 ` [PATCH 1/5] tcg: Decode the operand to INDEX_op_mb in dumps Richard Henderson
` (6 more replies)
0 siblings, 7 replies; 9+ messages in thread
From: Richard Henderson @ 2021-03-16 22:07 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell
This is intended to fix the current aarch64 host failure
for s390x guest cdrom-test. This is caused by the io thread
issuing memory barriers that are supposed to be matched by
the vcpu, but are elided by tcg in rr mode as "unnecessary".
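To make the pairing concrete, here is a minimal standalone sketch of the pattern, not code from the tree. It assumes plain C11 atomics, and `demo_queue`, `demo_publish` and `demo_consume` are hypothetical names invented for the example. One side publishes data behind a release fence, and the other side needs a matching acquire fence before it reads the data. On a weakly ordered host, dropping the consumer-side fence is the kind of elision described above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a queue shared by an io thread and a vcpu. */
struct demo_queue {
    uint32_t payload;     /* plain data, written before the flag */
    atomic_bool ready;    /* announces that payload is valid */
};

/* Producer side (think: io thread). */
static void demo_publish(struct demo_queue *q, uint32_t value)
{
    assert(q != NULL);
    q->payload = value;
    /* Order the payload store before the flag store. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&q->ready, true, memory_order_relaxed);
}

/* Consumer side (think: vcpu). Returns true if a payload was read. */
static bool demo_consume(struct demo_queue *q, uint32_t *out)
{
    assert(q != NULL && out != NULL);
    if (!atomic_load_explicit(&q->ready, memory_order_relaxed)) {
        return false;
    }
    /*
     * Pairs with the release fence in demo_publish().  Without it,
     * a weakly ordered host may satisfy the payload load early and
     * return stale data even though the flag was seen as set.
     */
    atomic_thread_fence(memory_order_acquire);
    *out = q->payload;
    return true;
}
```

In a real run the two functions would execute on different threads. The sketch only shows where each half of the pair has to sit.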
I know Peter would like a smaller patch to sync the io thread
with the vcpu thread. I've made a couple of attempts at this,
but haven't managed to get something reliable (although the
failure is now irritatingly infrequent -- about 1 in 500).
I have additional patches to further optimize barriers, and to
generate load-acquire/store-release instructions in tcg.
But it's late in the release cycle, etc etc.
I've done nothing to measure the performance impact of this.
I quit the cdrom-test cycle after 4000 passes.
r~
Richard Henderson (5):
tcg: Decode the operand to INDEX_op_mb in dumps
tcg: Do not elide memory barriers for CF_PARALLEL
tcg: Elide memory barriers implied by the host memory model
tcg: Create tcg_req_mo
tcg: Add host memory barriers to cpu_ldst.h interfaces
include/exec/cpu_ldst.h | 7 ++++
include/tcg/tcg.h | 20 +++++++++++
accel/tcg/cputlb.c | 2 ++
accel/tcg/tcg-all.c | 6 +---
accel/tcg/user-exec.c | 17 +++++++++
tcg/tcg-op.c | 19 +++++-----
tcg/tcg.c | 79 +++++++++++++++++++++++++++++++++++++++++
7 files changed, 137 insertions(+), 13 deletions(-)
--
2.25.1
* [PATCH 1/5] tcg: Decode the operand to INDEX_op_mb in dumps
2021-03-16 22:07 [PATCH 0/5] tcg: Issue memory barriers for guest memory model Richard Henderson
@ 2021-03-16 22:07 ` Richard Henderson
2021-03-17 13:32 ` Philippe Mathieu-Daudé
2021-03-16 22:07 ` [PATCH 2/5] tcg: Do not elide memory barriers for CF_PARALLEL Richard Henderson
` (5 subsequent siblings)
6 siblings, 1 reply; 9+ messages in thread
From: Richard Henderson @ 2021-03-16 22:07 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/tcg.c | 79 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 79 insertions(+)
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 2991112829..23a94d771c 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -2415,6 +2415,85 @@ static void tcg_dump_ops(TCGContext *s, bool have_prefs)
arg_label(op->args[k])->id);
i++, k++;
break;
+ case INDEX_op_mb:
+ {
+ TCGBar membar = op->args[k];
+ const char *b_op, *m_op;
+
+ switch (membar & TCG_BAR_SC) {
+ case 0:
+ b_op = "none";
+ break;
+ case TCG_BAR_LDAQ:
+ b_op = "acq";
+ break;
+ case TCG_BAR_STRL:
+ b_op = "rel";
+ break;
+ case TCG_BAR_SC:
+ b_op = "seq";
+ break;
+ default:
+ g_assert_not_reached();
+ }
+
+ switch (membar & TCG_MO_ALL) {
+ case 0:
+ m_op = "none";
+ break;
+ case TCG_MO_LD_LD:
+ m_op = "rr";
+ break;
+ case TCG_MO_LD_ST:
+ m_op = "rw";
+ break;
+ case TCG_MO_ST_LD:
+ m_op = "wr";
+ break;
+ case TCG_MO_ST_ST:
+ m_op = "ww";
+ break;
+ case TCG_MO_LD_LD | TCG_MO_LD_ST:
+ m_op = "rr+rw";
+ break;
+ case TCG_MO_LD_LD | TCG_MO_ST_LD:
+ m_op = "rr+wr";
+ break;
+ case TCG_MO_LD_LD | TCG_MO_ST_ST:
+ m_op = "rr+ww";
+ break;
+ case TCG_MO_LD_ST | TCG_MO_ST_LD:
+ m_op = "rw+wr";
+ break;
+ case TCG_MO_LD_ST | TCG_MO_ST_ST:
+ m_op = "rw+ww";
+ break;
+ case TCG_MO_ST_LD | TCG_MO_ST_ST:
+ m_op = "wr+ww";
+ break;
+ case TCG_MO_LD_LD | TCG_MO_LD_ST | TCG_MO_ST_LD:
+ m_op = "rr+rw+wr";
+ break;
+ case TCG_MO_LD_LD | TCG_MO_LD_ST | TCG_MO_ST_ST:
+ m_op = "rr+rw+ww";
+ break;
+ case TCG_MO_LD_LD | TCG_MO_ST_LD | TCG_MO_ST_ST:
+ m_op = "rr+wr+ww";
+ break;
+ case TCG_MO_LD_ST | TCG_MO_ST_LD | TCG_MO_ST_ST:
+ m_op = "rw+wr+ww";
+ break;
+ case TCG_MO_ALL:
+ m_op = "all";
+ break;
+ default:
+ g_assert_not_reached();
+ }
+
+ col += qemu_log("%s%s:%s", (k ? "," : ""), b_op, m_op);
+ i++, k++;
+ }
+ break;
default:
break;
}
--
2.25.1
* [PATCH 2/5] tcg: Do not elide memory barriers for CF_PARALLEL
2021-03-16 22:07 [PATCH 0/5] tcg: Issue memory barriers for guest memory model Richard Henderson
2021-03-16 22:07 ` [PATCH 1/5] tcg: Decode the operand to INDEX_op_mb in dumps Richard Henderson
@ 2021-03-16 22:07 ` Richard Henderson
2021-03-16 22:07 ` [PATCH 3/5] tcg: Elide memory barriers implied by the host memory model Richard Henderson
` (4 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: Richard Henderson @ 2021-03-16 22:07 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell
The virtio devices require proper memory ordering between
the vcpus and the iothreads.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/tcg-op.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index 70475773f4..76dc7d8dc5 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -97,9 +97,13 @@ void tcg_gen_op6(TCGOpcode opc, TCGArg a1, TCGArg a2, TCGArg a3,
void tcg_gen_mb(TCGBar mb_type)
{
- if (tcg_ctx->tb_cflags & CF_PARALLEL) {
- tcg_gen_op1(INDEX_op_mb, mb_type);
- }
+ /*
+ * It is tempting to elide the barrier in a single-threaded context
+ * (i.e. !(tb_cflags & CF_PARALLEL)), however, even with a single cpu
+ * we have i/o threads running in parallel, and lack of memory order
+ * can result in e.g. virtio queue entries being read incorrectly.
+ */
+ tcg_gen_op1(INDEX_op_mb, mb_type);
}
/* 32 bit ops */
--
2.25.1
* [PATCH 3/5] tcg: Elide memory barriers implied by the host memory model
2021-03-16 22:07 [PATCH 0/5] tcg: Issue memory barriers for guest memory model Richard Henderson
2021-03-16 22:07 ` [PATCH 1/5] tcg: Decode the operand to INDEX_op_mb in dumps Richard Henderson
2021-03-16 22:07 ` [PATCH 2/5] tcg: Do not elide memory barriers for CF_PARALLEL Richard Henderson
@ 2021-03-16 22:07 ` Richard Henderson
2021-03-16 22:07 ` [PATCH 4/5] tcg: Create tcg_req_mo Richard Henderson
` (3 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: Richard Henderson @ 2021-03-16 22:07 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell
Reduce the set of required barriers to those needed by
the host right from the beginning.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
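As an illustration of the filtering this patch introduces, here is a minimal standalone sketch. The `DEMO_*` names and their numeric values are assumptions made for the example, not values copied from the tree. `demo_filter_mb` is a hypothetical function that returns the barrier that would still be emitted, or 0 when the barrier would be elided.

```c
#include <assert.h>

/* Illustrative bit layout: four ordering bits plus a barrier-kind field. */
enum {
    DEMO_MO_LD_LD = 0x01,   /* load  before later load  */
    DEMO_MO_ST_LD = 0x02,   /* store before later load  */
    DEMO_MO_LD_ST = 0x04,   /* load  before later store */
    DEMO_MO_ST_ST = 0x08,   /* store before later store */
    DEMO_MO_ALL   = 0x0f,
    DEMO_BAR_SC   = 0x30,   /* barrier kind, not an ordering bit */
};

/* A TSO-like host orders everything except store-then-load. */
#define DEMO_HOST_TSO   (DEMO_MO_ALL & ~DEMO_MO_ST_LD)
/* A weakly ordered host guarantees none of the four orderings. */
#define DEMO_HOST_WEAK  0

/* Drop the orderings the host already provides; 0 means "emit nothing". */
static unsigned demo_filter_mb(unsigned mb_type, unsigned host_mo)
{
    /* The host model is expected to contain ordering bits only. */
    assert((host_mo & ~(unsigned)DEMO_MO_ALL) == 0);

    mb_type &= ~host_mo;
    return (mb_type & DEMO_MO_ALL) ? mb_type : 0;
}
```

With these assumed values, a load-load barrier disappears on the TSO-like host but survives on the weak host, while a store-load barrier survives on both.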
tcg/tcg-op.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index 76dc7d8dc5..c8501508c2 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -102,8 +102,13 @@ void tcg_gen_mb(TCGBar mb_type)
* (i.e. !(tb_cflags & CF_PARALLEL)), however, even with a single cpu
* we have i/o threads running in parallel, and lack of memory order
* can result in e.g. virtio queue entries being read incorrectly.
+ *
+ * That said, we can elide anything which the host provides for free.
*/
- tcg_gen_op1(INDEX_op_mb, mb_type);
+ mb_type &= ~TCG_TARGET_DEFAULT_MO;
+ if (mb_type & TCG_MO_ALL) {
+ tcg_gen_op1(INDEX_op_mb, mb_type);
+ }
}
/* 32 bit ops */
--
2.25.1
* [PATCH 4/5] tcg: Create tcg_req_mo
2021-03-16 22:07 [PATCH 0/5] tcg: Issue memory barriers for guest memory model Richard Henderson
` (2 preceding siblings ...)
2021-03-16 22:07 ` [PATCH 3/5] tcg: Elide memory barriers implied by the host memory model Richard Henderson
@ 2021-03-16 22:07 ` Richard Henderson
2021-03-16 22:07 ` [PATCH 5/5] tcg: Add host memory barriers to cpu_ldst.h interfaces Richard Henderson
` (2 subsequent siblings)
6 siblings, 0 replies; 9+ messages in thread
From: Richard Henderson @ 2021-03-16 22:07 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell
Split out the logic to emit a host memory barrier in response to
a guest memory operation. Do not provide a true default for
TCG_GUEST_DEFAULT_MO because the defined() check will still be
useful for determining if a guest has been updated for MTTCG.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
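The following standalone sketch shows how the filter described above behaves for a few guest/host combinations. It is written as a function rather than a macro so that both models can be passed in. The `DEMO_*` names and values are assumptions made for the example, and `demo_req_mo` and `demo_orders_compatible` are hypothetical helpers, not the interfaces added by this patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative ordering bits; the values are assumptions for this sketch. */
enum {
    DEMO_MO_LD_LD = 0x01,
    DEMO_MO_ST_LD = 0x02,
    DEMO_MO_LD_ST = 0x04,
    DEMO_MO_ST_ST = 0x08,
    DEMO_MO_ALL   = 0x0f,
};

#define DEMO_GUEST_TSO     (DEMO_MO_ALL & ~DEMO_MO_ST_LD)
#define DEMO_GUEST_STRICT  DEMO_MO_ALL   /* guest with no declared model */
#define DEMO_HOST_TSO      (DEMO_MO_ALL & ~DEMO_MO_ST_LD)
#define DEMO_HOST_WEAK     0

/* Orderings that the guest requires and the host does not provide. */
static unsigned demo_req_mo(unsigned type, unsigned guest_mo, unsigned host_mo)
{
    assert((guest_mo & ~(unsigned)DEMO_MO_ALL) == 0);
    assert((host_mo & ~(unsigned)DEMO_MO_ALL) == 0);
    return type & guest_mo & ~host_mo;
}

/* The models are compatible when no ordering at all needs a barrier. */
static bool demo_orders_compatible(unsigned guest_mo, unsigned host_mo)
{
    return demo_req_mo(DEMO_MO_ALL, guest_mo, host_mo) == 0;
}
```

Treating a guest without a declared model as `DEMO_GUEST_STRICT` mirrors the `#else` branch of the macro, which applies only the host mask.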
include/tcg/tcg.h | 20 ++++++++++++++++++++
accel/tcg/tcg-all.c | 6 +-----
tcg/tcg-op.c | 8 +-------
3 files changed, 22 insertions(+), 12 deletions(-)
diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index 0f0695e90d..395b3b6964 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -1245,6 +1245,26 @@ static inline unsigned get_mmuidx(TCGMemOpIdx oi)
return oi & 15;
}
+/**
+ * tcg_req_mo:
+ * @type: TCGBar
+ *
+ * Filter @type to the barrier that is required for the guest
+ * memory ordering vs the host memory ordering. A non-zero
+ * result indicates that some barrier is required.
+ *
+ * If TCG_GUEST_DEFAULT_MO is not defined, assume that the
+ * guest requires strict ordering.
+ *
+ * This is a macro so that it's constant even without optimization.
+ */
+#ifdef TCG_GUEST_DEFAULT_MO
+# define tcg_req_mo(type) \
+ ((type) & TCG_GUEST_DEFAULT_MO & ~TCG_TARGET_DEFAULT_MO)
+#else
+# define tcg_req_mo(type) ((type) & ~TCG_TARGET_DEFAULT_MO)
+#endif
+
/**
* tcg_qemu_tb_exec:
* @env: pointer to CPUArchState for the CPU
diff --git a/accel/tcg/tcg-all.c b/accel/tcg/tcg-all.c
index e378c2db73..6ae51e3476 100644
--- a/accel/tcg/tcg-all.c
+++ b/accel/tcg/tcg-all.c
@@ -69,11 +69,7 @@ DECLARE_INSTANCE_CHECKER(TCGState, TCG_STATE,
static bool check_tcg_memory_orders_compatible(void)
{
-#if defined(TCG_GUEST_DEFAULT_MO) && defined(TCG_TARGET_DEFAULT_MO)
- return (TCG_GUEST_DEFAULT_MO & ~TCG_TARGET_DEFAULT_MO) == 0;
-#else
- return false;
-#endif
+ return tcg_req_mo(TCG_MO_ALL) == 0;
}
static bool default_mttcg_enabled(void)
diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index c8501508c2..12fc8a1b17 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -2796,13 +2796,7 @@ static void gen_ldst_i64(TCGOpcode opc, TCGv_i64 val, TCGv addr,
static void tcg_gen_req_mo(TCGBar type)
{
-#ifdef TCG_GUEST_DEFAULT_MO
- type &= TCG_GUEST_DEFAULT_MO;
-#endif
- type &= ~TCG_TARGET_DEFAULT_MO;
- if (type) {
- tcg_gen_mb(type | TCG_BAR_SC);
- }
+ tcg_gen_mb(tcg_req_mo(type) | TCG_BAR_SC);
}
static inline TCGv plugin_prep_mem_callbacks(TCGv vaddr)
--
2.25.1
* [PATCH 5/5] tcg: Add host memory barriers to cpu_ldst.h interfaces
2021-03-16 22:07 [PATCH 0/5] tcg: Issue memory barriers for guest memory model Richard Henderson
` (3 preceding siblings ...)
2021-03-16 22:07 ` [PATCH 4/5] tcg: Create tcg_req_mo Richard Henderson
@ 2021-03-16 22:07 ` Richard Henderson
2021-03-16 22:30 ` [PATCH 0/5] tcg: Issue memory barriers for guest memory model no-reply
2021-06-26 6:06 ` Richard Henderson
6 siblings, 0 replies; 9+ messages in thread
From: Richard Henderson @ 2021-03-16 22:07 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell
Bring the majority of helpers into line with the rest of
tcg in respecting guest memory ordering.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
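For readers who want to see the shape of the helper-side barrier outside the tree, here is a minimal standalone sketch. It assumes a C11 `atomic_thread_fence(memory_order_seq_cst)` as a stand-in for the full barrier, a TSO-like guest on a weakly ordered host, and illustrative `DEMO_*` values. `demo_ldl`, `demo_stl` and `demo_fence_count` are hypothetical names used only to make the behaviour observable.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative ordering bits; the values are assumptions for this sketch. */
enum {
    DEMO_MO_LD_LD = 0x01,
    DEMO_MO_ST_LD = 0x02,
    DEMO_MO_LD_ST = 0x04,
    DEMO_MO_ST_ST = 0x08,
    DEMO_MO_ALL   = 0x0f,
};

#define DEMO_GUEST_MO  (DEMO_MO_ALL & ~DEMO_MO_ST_LD)  /* TSO-like guest */
#define DEMO_HOST_MO   0                               /* weak host */

static unsigned demo_fence_count;   /* counts barriers actually issued */

/* Issue a full barrier only if the guest needs an ordering the host lacks. */
static void demo_cpu_req_mo(unsigned type)
{
    if (type & DEMO_GUEST_MO & ~DEMO_HOST_MO) {
        /* Order all earlier guest accesses before the access that follows. */
        atomic_thread_fence(memory_order_seq_cst);
        demo_fence_count++;
    }
}

/* Load helper: earlier loads and stores must be ordered before this load. */
static uint32_t demo_ldl(const uint32_t *host_addr)
{
    assert(host_addr != NULL);
    demo_cpu_req_mo(DEMO_MO_LD_LD | DEMO_MO_ST_LD);
    return *host_addr;
}

/* Store helper: earlier loads and stores must be ordered before this store. */
static void demo_stl(uint32_t *host_addr, uint32_t val)
{
    assert(host_addr != NULL);
    demo_cpu_req_mo(DEMO_MO_LD_ST | DEMO_MO_ST_ST);
    *host_addr = val;
}
```

Because the masks are compile-time constants, a compiler can remove the branch entirely when guest and host models are compatible, which is the reason for keeping the check in a macro or inline path.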
include/exec/cpu_ldst.h | 7 +++++++
accel/tcg/cputlb.c | 2 ++
accel/tcg/user-exec.c | 17 +++++++++++++++++
3 files changed, 26 insertions(+)
diff --git a/include/exec/cpu_ldst.h b/include/exec/cpu_ldst.h
index ce6ce82618..f0ab79fe3c 100644
--- a/include/exec/cpu_ldst.h
+++ b/include/exec/cpu_ldst.h
@@ -169,6 +169,13 @@ void cpu_stl_le_data_ra(CPUArchState *env, abi_ptr ptr,
void cpu_stq_le_data_ra(CPUArchState *env, abi_ptr ptr,
uint64_t val, uintptr_t ra);
+#define cpu_req_mo(type) \
+ do { \
+ if (tcg_req_mo(type)) { \
+ smp_mb(); \
+ } \
+ } while (0)
+
#if defined(CONFIG_USER_ONLY)
extern __thread uintptr_t helper_retaddr;
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 8a7b779270..a3503eaa71 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -2100,6 +2100,7 @@ static inline uint64_t cpu_load_helper(CPUArchState *env, abi_ptr addr,
meminfo = trace_mem_get_info(op, mmu_idx, false);
trace_guest_mem_before_exec(env_cpu(env), addr, meminfo);
+ cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD);
op &= ~MO_SIGN;
oi = make_memop_idx(op, mmu_idx);
ret = full_load(env, addr, oi, retaddr);
@@ -2542,6 +2543,7 @@ cpu_store_helper(CPUArchState *env, target_ulong addr, uint64_t val,
meminfo = trace_mem_get_info(op, mmu_idx, true);
trace_guest_mem_before_exec(env_cpu(env), addr, meminfo);
+ cpu_req_mo(TCG_MO_LD_ST | TCG_MO_ST_ST);
oi = make_memop_idx(op, mmu_idx);
store_helper(env, addr, val, oi, retaddr, op);
diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c
index 0d8cc27b21..34f6dfcef4 100644
--- a/accel/tcg/user-exec.c
+++ b/accel/tcg/user-exec.c
@@ -843,6 +843,7 @@ uint32_t cpu_ldub_data(CPUArchState *env, abi_ptr ptr)
uint16_t meminfo = trace_mem_get_info(MO_UB, MMU_USER_IDX, false);
trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
+ cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD);
ret = ldub_p(g2h(env_cpu(env), ptr));
qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo);
return ret;
@@ -854,6 +855,7 @@ int cpu_ldsb_data(CPUArchState *env, abi_ptr ptr)
uint16_t meminfo = trace_mem_get_info(MO_SB, MMU_USER_IDX, false);
trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
+ cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD);
ret = ldsb_p(g2h(env_cpu(env), ptr));
qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo);
return ret;
@@ -865,6 +867,7 @@ uint32_t cpu_lduw_be_data(CPUArchState *env, abi_ptr ptr)
uint16_t meminfo = trace_mem_get_info(MO_BEUW, MMU_USER_IDX, false);
trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
+ cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD);
ret = lduw_be_p(g2h(env_cpu(env), ptr));
qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo);
return ret;
@@ -876,6 +879,7 @@ int cpu_ldsw_be_data(CPUArchState *env, abi_ptr ptr)
uint16_t meminfo = trace_mem_get_info(MO_BESW, MMU_USER_IDX, false);
trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
+ cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD);
ret = ldsw_be_p(g2h(env_cpu(env), ptr));
qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo);
return ret;
@@ -887,6 +891,7 @@ uint32_t cpu_ldl_be_data(CPUArchState *env, abi_ptr ptr)
uint16_t meminfo = trace_mem_get_info(MO_BEUL, MMU_USER_IDX, false);
trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
+ cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD);
ret = ldl_be_p(g2h(env_cpu(env), ptr));
qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo);
return ret;
@@ -898,6 +903,7 @@ uint64_t cpu_ldq_be_data(CPUArchState *env, abi_ptr ptr)
uint16_t meminfo = trace_mem_get_info(MO_BEQ, MMU_USER_IDX, false);
trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
+ cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD);
ret = ldq_be_p(g2h(env_cpu(env), ptr));
qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo);
return ret;
@@ -909,6 +915,7 @@ uint32_t cpu_lduw_le_data(CPUArchState *env, abi_ptr ptr)
uint16_t meminfo = trace_mem_get_info(MO_LEUW, MMU_USER_IDX, false);
trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
+ cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD);
ret = lduw_le_p(g2h(env_cpu(env), ptr));
qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo);
return ret;
@@ -920,6 +927,7 @@ int cpu_ldsw_le_data(CPUArchState *env, abi_ptr ptr)
uint16_t meminfo = trace_mem_get_info(MO_LESW, MMU_USER_IDX, false);
trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
+ cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD);
ret = ldsw_le_p(g2h(env_cpu(env), ptr));
qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo);
return ret;
@@ -931,6 +939,7 @@ uint32_t cpu_ldl_le_data(CPUArchState *env, abi_ptr ptr)
uint16_t meminfo = trace_mem_get_info(MO_LEUL, MMU_USER_IDX, false);
trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
+ cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD);
ret = ldl_le_p(g2h(env_cpu(env), ptr));
qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo);
return ret;
@@ -942,6 +951,7 @@ uint64_t cpu_ldq_le_data(CPUArchState *env, abi_ptr ptr)
uint16_t meminfo = trace_mem_get_info(MO_LEQ, MMU_USER_IDX, false);
trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
+ cpu_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD);
ret = ldq_le_p(g2h(env_cpu(env), ptr));
qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo);
return ret;
@@ -1052,6 +1062,7 @@ void cpu_stb_data(CPUArchState *env, abi_ptr ptr, uint32_t val)
uint16_t meminfo = trace_mem_get_info(MO_UB, MMU_USER_IDX, true);
trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
+ cpu_req_mo(TCG_MO_LD_ST | TCG_MO_ST_ST);
stb_p(g2h(env_cpu(env), ptr), val);
qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo);
}
@@ -1061,6 +1072,7 @@ void cpu_stw_be_data(CPUArchState *env, abi_ptr ptr, uint32_t val)
uint16_t meminfo = trace_mem_get_info(MO_BEUW, MMU_USER_IDX, true);
trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
+ cpu_req_mo(TCG_MO_LD_ST | TCG_MO_ST_ST);
stw_be_p(g2h(env_cpu(env), ptr), val);
qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo);
}
@@ -1070,6 +1082,7 @@ void cpu_stl_be_data(CPUArchState *env, abi_ptr ptr, uint32_t val)
uint16_t meminfo = trace_mem_get_info(MO_BEUL, MMU_USER_IDX, true);
trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
+ cpu_req_mo(TCG_MO_LD_ST | TCG_MO_ST_ST);
stl_be_p(g2h(env_cpu(env), ptr), val);
qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo);
}
@@ -1079,6 +1092,7 @@ void cpu_stq_be_data(CPUArchState *env, abi_ptr ptr, uint64_t val)
uint16_t meminfo = trace_mem_get_info(MO_BEQ, MMU_USER_IDX, true);
trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
+ cpu_req_mo(TCG_MO_LD_ST | TCG_MO_ST_ST);
stq_be_p(g2h(env_cpu(env), ptr), val);
qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo);
}
@@ -1088,6 +1102,7 @@ void cpu_stw_le_data(CPUArchState *env, abi_ptr ptr, uint32_t val)
uint16_t meminfo = trace_mem_get_info(MO_LEUW, MMU_USER_IDX, true);
trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
+ cpu_req_mo(TCG_MO_LD_ST | TCG_MO_ST_ST);
stw_le_p(g2h(env_cpu(env), ptr), val);
qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo);
}
@@ -1097,6 +1112,7 @@ void cpu_stl_le_data(CPUArchState *env, abi_ptr ptr, uint32_t val)
uint16_t meminfo = trace_mem_get_info(MO_LEUL, MMU_USER_IDX, true);
trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
+ cpu_req_mo(TCG_MO_LD_ST | TCG_MO_ST_ST);
stl_le_p(g2h(env_cpu(env), ptr), val);
qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo);
}
@@ -1106,6 +1122,7 @@ void cpu_stq_le_data(CPUArchState *env, abi_ptr ptr, uint64_t val)
uint16_t meminfo = trace_mem_get_info(MO_LEQ, MMU_USER_IDX, true);
trace_guest_mem_before_exec(env_cpu(env), ptr, meminfo);
+ cpu_req_mo(TCG_MO_LD_ST | TCG_MO_ST_ST);
stq_le_p(g2h(env_cpu(env), ptr), val);
qemu_plugin_vcpu_mem_cb(env_cpu(env), ptr, meminfo);
}
--
2.25.1
* Re: [PATCH 0/5] tcg: Issue memory barriers for guest memory model
2021-03-16 22:07 [PATCH 0/5] tcg: Issue memory barriers for guest memory model Richard Henderson
` (4 preceding siblings ...)
2021-03-16 22:07 ` [PATCH 5/5] tcg: Add host memory barriers to cpu_ldst.h interfaces Richard Henderson
@ 2021-03-16 22:30 ` no-reply
2021-06-26 6:06 ` Richard Henderson
6 siblings, 0 replies; 9+ messages in thread
From: no-reply @ 2021-03-16 22:30 UTC (permalink / raw)
To: richard.henderson; +Cc: peter.maydell, qemu-devel
Patchew URL: https://patchew.org/QEMU/20210316220735.2048137-1-richard.henderson@linaro.org/
Hi,
This series seems to have some coding style problems. See output below for
more information:
Type: series
Message-id: 20210316220735.2048137-1-richard.henderson@linaro.org
Subject: [PATCH 0/5] tcg: Issue memory barriers for guest memory model
=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===
Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
- [tag update] patchew/20210311143958.562625-1-richard.henderson@linaro.org -> patchew/20210311143958.562625-1-richard.henderson@linaro.org
* [new tag] patchew/20210316220735.2048137-1-richard.henderson@linaro.org -> patchew/20210316220735.2048137-1-richard.henderson@linaro.org
Switched to a new branch 'test'
06ceb5a tcg: Add host memory barriers to cpu_ldst.h interfaces
be4ade5 tcg: Create tcg_req_mo
1336778 tcg: Elide memory barriers implied by the host memory model
d0f90d5 tcg: Do not elide memory barriers for CF_PARALLEL
c9f634b tcg: Decode the operand to INDEX_op_mb in dumps
=== OUTPUT BEGIN ===
1/5 Checking commit c9f634bdbe20 (tcg: Decode the operand to INDEX_op_mb in dumps)
2/5 Checking commit d0f90d584f17 (tcg: Do not elide memory barriers for CF_PARALLEL)
3/5 Checking commit 133677838f14 (tcg: Elide memory barriers implied by the host memory model)
4/5 Checking commit be4ade51a457 (tcg: Create tcg_req_mo)
5/5 Checking commit 06ceb5ad212a (tcg: Add host memory barriers to cpu_ldst.h interfaces)
ERROR: memory barrier without comment
#189: FILE: include/exec/cpu_ldst.h:175:
+ smp_mb(); \
total: 1 errors, 0 warnings, 146 lines checked
Patch 5/5 has style problems, please review. If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
=== OUTPUT END ===
Test command exited with code: 1
The full log is available at
http://patchew.org/logs/20210316220735.2048137-1-richard.henderson@linaro.org/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com
* Re: [PATCH 1/5] tcg: Decode the operand to INDEX_op_mb in dumps
2021-03-16 22:07 ` [PATCH 1/5] tcg: Decode the operand to INDEX_op_mb in dumps Richard Henderson
@ 2021-03-17 13:32 ` Philippe Mathieu-Daudé
0 siblings, 0 replies; 9+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-03-17 13:32 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: peter.maydell
On 3/16/21 11:07 PM, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> tcg/tcg.c | 79 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 79 insertions(+)
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
* Re: [PATCH 0/5] tcg: Issue memory barriers for guest memory model
2021-03-16 22:07 [PATCH 0/5] tcg: Issue memory barriers for guest memory model Richard Henderson
` (5 preceding siblings ...)
2021-03-16 22:30 ` [PATCH 0/5] tcg: Issue memory barriers for guest memory model no-reply
@ 2021-06-26 6:06 ` Richard Henderson
6 siblings, 0 replies; 9+ messages in thread
From: Richard Henderson @ 2021-06-26 6:06 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell
Ping. A local rebase still applies cleanly.
r~
On 3/16/21 3:07 PM, Richard Henderson wrote:
> This is intended to fix the current aarch64 host failure
> for s390x guest cdrom-test. This is caused by the io thread
> issuing memory barriers that are supposed to be matched by
> the vcpu, but are elided by tcg in rr mode as "unnecessary".
>
> I know Peter would like a smaller patch to sync the io thread
> with the vcpu thread. I've made a couple of attempts at this,
> but haven't managed to get something reliable (although the
> failure is now irritatingly infrequent -- about 1 in 500).
>
> I have additional patches to further optimize barriers, and to
> generate load-acquire/store-release instructions in tcg.
> But it's late in the release cycle, etc etc.
>
> I've done nothing to measure the performance impact of this.
> I quit the cdrom-test cycle after 4000 passes.
>
>
> r~
>
>
> Richard Henderson (5):
> tcg: Decode the operand to INDEX_op_mb in dumps
> tcg: Do not elide memory barriers for CF_PARALLEL
> tcg: Elide memory barriers implied by the host memory model
> tcg: Create tcg_req_mo
> tcg: Add host memory barriers to cpu_ldst.h interfaces
>
> include/exec/cpu_ldst.h | 7 ++++
> include/tcg/tcg.h | 20 +++++++++++
> accel/tcg/cputlb.c | 2 ++
> accel/tcg/tcg-all.c | 6 +---
> accel/tcg/user-exec.c | 17 +++++++++
> tcg/tcg-op.c | 19 +++++-----
> tcg/tcg.c | 79 +++++++++++++++++++++++++++++++++++++++++
> 7 files changed, 137 insertions(+), 13 deletions(-)
>