* [PULL 0/5] tcg patch queue
@ 2020-09-03 21:40 Richard Henderson
2020-09-03 21:40 ` [PULL 1/5] cputlb: Make store_helper less fragile to compiler optimizations Richard Henderson
` (5 more replies)
0 siblings, 6 replies; 15+ messages in thread
From: Richard Henderson @ 2020-09-03 21:40 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell
The following changes since commit 3dd23a4fb8fd72d2220a90a809f213999ffe7f3a:
Merge remote-tracking branch 'remotes/legoater/tags/pull-aspeed-20200901' into staging (2020-09-03 14:12:48 +0100)
are available in the Git repository at:
https://github.com/rth7680/qemu.git tags/pull-tcg-20200903
for you to fetch changes up to fe4b0b5bfa96c38ad1cad0689a86cca9f307e353:
tcg: Implement 256-bit dup for tcg_gen_gvec_dup_mem (2020-09-03 13:13:58 -0700)
----------------------------------------------------------------
Improve inlining in cputlb.c.
Fix vector abs fallback.
Only set parallel_cpus for SMP.
Add vector dupm for 256-bit elements.
----------------------------------------------------------------
Richard Henderson (4):
cputlb: Make store_helper less fragile to compiler optimizations
softmmu/cpus: Only set parallel_cpus for SMP
tcg: Eliminate one store for in-place 128-bit dup_mem
tcg: Implement 256-bit dup for tcg_gen_gvec_dup_mem
Stephen Long (1):
tcg: Fix tcg gen for vectorized absolute value
accel/tcg/cputlb.c | 138 ++++++++++++++++++++++++++++++-----------------------
softmmu/cpus.c | 11 ++++-
tcg/tcg-op-gvec.c | 61 ++++++++++++++++++++---
3 files changed, 143 insertions(+), 67 deletions(-)
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PULL 1/5] cputlb: Make store_helper less fragile to compiler optimizations
2020-09-03 21:40 [PULL 0/5] tcg patch queue Richard Henderson
@ 2020-09-03 21:40 ` Richard Henderson
2020-09-03 21:40 ` [PULL 2/5] tcg: Fix tcg gen for vectorized absolute value Richard Henderson
` (4 subsequent siblings)
5 siblings, 0 replies; 15+ messages in thread
From: Richard Henderson @ 2020-09-03 21:40 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell, Alex Bennée, Shu-Chun Weng
This has no functional change.
The current function structure is:
inline QEMU_ALWAYSINLINE
store_memop() {
switch () {
...
default:
qemu_build_not_reached();
}
}
inline QEMU_ALWAYSINLINE
store_helper() {
...
if (span_two_pages_or_io) {
...
helper_ret_stb_mmu();
}
store_memop();
}
helper_ret_stb_mmu() {
store_helper();
}
Whereas GCC will generate an error at compile-time when an always_inline
function is not inlined, Clang does not. Nor does Clang prioritize the
inlining of always_inline functions. Both of these are arguably bugs.
Both `store_memop` and `store_helper` need to be inlined and allow
constant propogations to eliminate the `qemu_build_not_reached` call.
However, if the compiler instead chooses to inline helper_ret_stb_mmu
into store_helper, then store_helper is now self-recursive and the
compiler is no longer able to propagate the constant in the same way.
This does not produce at current QEMU head, but was reproducible
at v4.2.0 with `clang-10 -O2 -fexperimental-new-pass-manager`.
The inline recursion problem can be fixed solely by marking
helper_ret_stb_mmu as noinline, so the compiler does not make an
incorrect decision about which functions to inline.
In addition, extract store_helper_unaligned as a noinline subroutine
that can be shared by all of the helpers. This saves about 6k code
size in an optimized x86_64 build.
Reported-by: Shu-Chun Weng <scw@google.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
accel/tcg/cputlb.c | 138 ++++++++++++++++++++++++++-------------------
1 file changed, 79 insertions(+), 59 deletions(-)
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 2d48281942..6489abbf8c 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -2009,6 +2009,80 @@ store_memop(void *haddr, uint64_t val, MemOp op)
}
}
+static void __attribute__((noinline))
+store_helper_unaligned(CPUArchState *env, target_ulong addr, uint64_t val,
+ uintptr_t retaddr, size_t size, uintptr_t mmu_idx,
+ bool big_endian)
+{
+ const size_t tlb_off = offsetof(CPUTLBEntry, addr_write);
+ uintptr_t index, index2;
+ CPUTLBEntry *entry, *entry2;
+ target_ulong page2, tlb_addr, tlb_addr2;
+ TCGMemOpIdx oi;
+ size_t size2;
+ int i;
+
+ /*
+ * Ensure the second page is in the TLB. Note that the first page
+ * is already guaranteed to be filled, and that the second page
+ * cannot evict the first.
+ */
+ page2 = (addr + size) & TARGET_PAGE_MASK;
+ size2 = (addr + size) & ~TARGET_PAGE_MASK;
+ index2 = tlb_index(env, mmu_idx, page2);
+ entry2 = tlb_entry(env, mmu_idx, page2);
+
+ tlb_addr2 = tlb_addr_write(entry2);
+ if (!tlb_hit_page(tlb_addr2, page2)) {
+ if (!victim_tlb_hit(env, mmu_idx, index2, tlb_off, page2)) {
+ tlb_fill(env_cpu(env), page2, size2, MMU_DATA_STORE,
+ mmu_idx, retaddr);
+ index2 = tlb_index(env, mmu_idx, page2);
+ entry2 = tlb_entry(env, mmu_idx, page2);
+ }
+ tlb_addr2 = tlb_addr_write(entry2);
+ }
+
+ index = tlb_index(env, mmu_idx, addr);
+ entry = tlb_entry(env, mmu_idx, addr);
+ tlb_addr = tlb_addr_write(entry);
+
+ /*
+ * Handle watchpoints. Since this may trap, all checks
+ * must happen before any store.
+ */
+ if (unlikely(tlb_addr & TLB_WATCHPOINT)) {
+ cpu_check_watchpoint(env_cpu(env), addr, size - size2,
+ env_tlb(env)->d[mmu_idx].iotlb[index].attrs,
+ BP_MEM_WRITE, retaddr);
+ }
+ if (unlikely(tlb_addr2 & TLB_WATCHPOINT)) {
+ cpu_check_watchpoint(env_cpu(env), page2, size2,
+ env_tlb(env)->d[mmu_idx].iotlb[index2].attrs,
+ BP_MEM_WRITE, retaddr);
+ }
+
+ /*
+ * XXX: not efficient, but simple.
+ * This loop must go in the forward direction to avoid issues
+ * with self-modifying code in Windows 64-bit.
+ */
+ oi = make_memop_idx(MO_UB, mmu_idx);
+ if (big_endian) {
+ for (i = 0; i < size; ++i) {
+ /* Big-endian extract. */
+ uint8_t val8 = val >> (((size - 1) * 8) - (i * 8));
+ helper_ret_stb_mmu(env, addr + i, val8, oi, retaddr);
+ }
+ } else {
+ for (i = 0; i < size; ++i) {
+ /* Little-endian extract. */
+ uint8_t val8 = val >> (i * 8);
+ helper_ret_stb_mmu(env, addr + i, val8, oi, retaddr);
+ }
+ }
+}
+
static inline void QEMU_ALWAYS_INLINE
store_helper(CPUArchState *env, target_ulong addr, uint64_t val,
TCGMemOpIdx oi, uintptr_t retaddr, MemOp op)
@@ -2097,64 +2171,9 @@ store_helper(CPUArchState *env, target_ulong addr, uint64_t val,
if (size > 1
&& unlikely((addr & ~TARGET_PAGE_MASK) + size - 1
>= TARGET_PAGE_SIZE)) {
- int i;
- uintptr_t index2;
- CPUTLBEntry *entry2;
- target_ulong page2, tlb_addr2;
- size_t size2;
-
do_unaligned_access:
- /*
- * Ensure the second page is in the TLB. Note that the first page
- * is already guaranteed to be filled, and that the second page
- * cannot evict the first.
- */
- page2 = (addr + size) & TARGET_PAGE_MASK;
- size2 = (addr + size) & ~TARGET_PAGE_MASK;
- index2 = tlb_index(env, mmu_idx, page2);
- entry2 = tlb_entry(env, mmu_idx, page2);
- tlb_addr2 = tlb_addr_write(entry2);
- if (!tlb_hit_page(tlb_addr2, page2)) {
- if (!victim_tlb_hit(env, mmu_idx, index2, tlb_off, page2)) {
- tlb_fill(env_cpu(env), page2, size2, MMU_DATA_STORE,
- mmu_idx, retaddr);
- index2 = tlb_index(env, mmu_idx, page2);
- entry2 = tlb_entry(env, mmu_idx, page2);
- }
- tlb_addr2 = tlb_addr_write(entry2);
- }
-
- /*
- * Handle watchpoints. Since this may trap, all checks
- * must happen before any store.
- */
- if (unlikely(tlb_addr & TLB_WATCHPOINT)) {
- cpu_check_watchpoint(env_cpu(env), addr, size - size2,
- env_tlb(env)->d[mmu_idx].iotlb[index].attrs,
- BP_MEM_WRITE, retaddr);
- }
- if (unlikely(tlb_addr2 & TLB_WATCHPOINT)) {
- cpu_check_watchpoint(env_cpu(env), page2, size2,
- env_tlb(env)->d[mmu_idx].iotlb[index2].attrs,
- BP_MEM_WRITE, retaddr);
- }
-
- /*
- * XXX: not efficient, but simple.
- * This loop must go in the forward direction to avoid issues
- * with self-modifying code in Windows 64-bit.
- */
- for (i = 0; i < size; ++i) {
- uint8_t val8;
- if (memop_big_endian(op)) {
- /* Big-endian extract. */
- val8 = val >> (((size - 1) * 8) - (i * 8));
- } else {
- /* Little-endian extract. */
- val8 = val >> (i * 8);
- }
- helper_ret_stb_mmu(env, addr + i, val8, oi, retaddr);
- }
+ store_helper_unaligned(env, addr, val, retaddr, size,
+ mmu_idx, memop_big_endian(op));
return;
}
@@ -2162,8 +2181,9 @@ store_helper(CPUArchState *env, target_ulong addr, uint64_t val,
store_memop(haddr, val, op);
}
-void helper_ret_stb_mmu(CPUArchState *env, target_ulong addr, uint8_t val,
- TCGMemOpIdx oi, uintptr_t retaddr)
+void __attribute__((noinline))
+helper_ret_stb_mmu(CPUArchState *env, target_ulong addr, uint8_t val,
+ TCGMemOpIdx oi, uintptr_t retaddr)
{
store_helper(env, addr, val, oi, retaddr, MO_UB);
}
--
2.25.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PULL 2/5] tcg: Fix tcg gen for vectorized absolute value
2020-09-03 21:40 [PULL 0/5] tcg patch queue Richard Henderson
2020-09-03 21:40 ` [PULL 1/5] cputlb: Make store_helper less fragile to compiler optimizations Richard Henderson
@ 2020-09-03 21:40 ` Richard Henderson
2020-09-03 21:40 ` [PULL 3/5] softmmu/cpus: Only set parallel_cpus for SMP Richard Henderson
` (3 subsequent siblings)
5 siblings, 0 replies; 15+ messages in thread
From: Richard Henderson @ 2020-09-03 21:40 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell, Stephen Long
From: Stephen Long <steplong@quicinc.com>
The fallback inline expansion for vectorized absolute value,
when the host doesn't support such an insn was flawed.
E.g. when a vector of bytes has all elements negative, mask
will be 0xffff_ffff_ffff_ffff. Subtracting mask only adds 1
to the low element instead of all elements becase -mask is 1
and not 0x0101_0101_0101_0101.
Signed-off-by: Stephen Long <steplong@quicinc.com>
Message-Id: <20200813161818.190-1-steplong@quicinc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/tcg-op-gvec.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c
index 3707c0effb..793d4ba64c 100644
--- a/tcg/tcg-op-gvec.c
+++ b/tcg/tcg-op-gvec.c
@@ -2264,12 +2264,13 @@ static void gen_absv_mask(TCGv_i64 d, TCGv_i64 b, unsigned vece)
tcg_gen_muli_i64(t, t, (1 << nbit) - 1);
/*
- * Invert (via xor -1) and add one (via sub -1).
+ * Invert (via xor -1) and add one.
* Because of the ordering the msb is cleared,
* so we never have carry into the next element.
*/
tcg_gen_xor_i64(d, b, t);
- tcg_gen_sub_i64(d, d, t);
+ tcg_gen_andi_i64(t, t, dup_const(vece, 1));
+ tcg_gen_add_i64(d, d, t);
tcg_temp_free_i64(t);
}
--
2.25.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PULL 3/5] softmmu/cpus: Only set parallel_cpus for SMP
2020-09-03 21:40 [PULL 0/5] tcg patch queue Richard Henderson
2020-09-03 21:40 ` [PULL 1/5] cputlb: Make store_helper less fragile to compiler optimizations Richard Henderson
2020-09-03 21:40 ` [PULL 2/5] tcg: Fix tcg gen for vectorized absolute value Richard Henderson
@ 2020-09-03 21:40 ` Richard Henderson
2020-09-07 10:05 ` Claudio Fontana
2020-09-03 21:41 ` [PULL 4/5] tcg: Eliminate one store for in-place 128-bit dup_mem Richard Henderson
` (2 subsequent siblings)
5 siblings, 1 reply; 15+ messages in thread
From: Richard Henderson @ 2020-09-03 21:40 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell, Philippe Mathieu-Daudé
Do not set parallel_cpus if there is only one cpu instantiated.
This will allow tcg to use serial code to implement atomics.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
softmmu/cpus.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/softmmu/cpus.c b/softmmu/cpus.c
index a802e899ab..e3b98065c9 100644
--- a/softmmu/cpus.c
+++ b/softmmu/cpus.c
@@ -1895,6 +1895,16 @@ static void qemu_tcg_init_vcpu(CPUState *cpu)
if (!tcg_region_inited) {
tcg_region_inited = 1;
tcg_region_init();
+ /*
+ * If MTTCG, and we will create multiple cpus,
+ * then we will have cpus running in parallel.
+ */
+ if (qemu_tcg_mttcg_enabled()) {
+ MachineState *ms = MACHINE(qdev_get_machine());
+ if (ms->smp.max_cpus > 1) {
+ parallel_cpus = true;
+ }
+ }
}
if (qemu_tcg_mttcg_enabled() || !single_tcg_cpu_thread) {
@@ -1904,7 +1914,6 @@ static void qemu_tcg_init_vcpu(CPUState *cpu)
if (qemu_tcg_mttcg_enabled()) {
/* create a thread per vCPU with TCG (MTTCG) */
- parallel_cpus = true;
snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/TCG",
cpu->cpu_index);
--
2.25.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PULL 4/5] tcg: Eliminate one store for in-place 128-bit dup_mem
2020-09-03 21:40 [PULL 0/5] tcg patch queue Richard Henderson
` (2 preceding siblings ...)
2020-09-03 21:40 ` [PULL 3/5] softmmu/cpus: Only set parallel_cpus for SMP Richard Henderson
@ 2020-09-03 21:41 ` Richard Henderson
2020-09-03 21:41 ` [PULL 5/5] tcg: Implement 256-bit dup for tcg_gen_gvec_dup_mem Richard Henderson
2020-09-06 13:07 ` [PULL 0/5] tcg patch queue Peter Maydell
5 siblings, 0 replies; 15+ messages in thread
From: Richard Henderson @ 2020-09-03 21:41 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell, Philippe Mathieu-Daudé
Do not store back to the exact memory from which we just loaded.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/tcg-op-gvec.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c
index 793d4ba64c..fcc25b04e6 100644
--- a/tcg/tcg-op-gvec.c
+++ b/tcg/tcg-op-gvec.c
@@ -1581,7 +1581,7 @@ void tcg_gen_gvec_dup_mem(unsigned vece, uint32_t dofs, uint32_t aofs,
TCGv_vec in = tcg_temp_new_vec(TCG_TYPE_V128);
tcg_gen_ld_vec(in, cpu_env, aofs);
- for (i = 0; i < oprsz; i += 16) {
+ for (i = (aofs == dofs) * 16; i < oprsz; i += 16) {
tcg_gen_st_vec(in, cpu_env, dofs + i);
}
tcg_temp_free_vec(in);
@@ -1591,7 +1591,7 @@ void tcg_gen_gvec_dup_mem(unsigned vece, uint32_t dofs, uint32_t aofs,
tcg_gen_ld_i64(in0, cpu_env, aofs);
tcg_gen_ld_i64(in1, cpu_env, aofs + 8);
- for (i = 0; i < oprsz; i += 16) {
+ for (i = (aofs == dofs) * 16; i < oprsz; i += 16) {
tcg_gen_st_i64(in0, cpu_env, dofs + i);
tcg_gen_st_i64(in1, cpu_env, dofs + i + 8);
}
--
2.25.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PULL 5/5] tcg: Implement 256-bit dup for tcg_gen_gvec_dup_mem
2020-09-03 21:40 [PULL 0/5] tcg patch queue Richard Henderson
` (3 preceding siblings ...)
2020-09-03 21:41 ` [PULL 4/5] tcg: Eliminate one store for in-place 128-bit dup_mem Richard Henderson
@ 2020-09-03 21:41 ` Richard Henderson
2020-09-06 13:07 ` [PULL 0/5] tcg patch queue Peter Maydell
5 siblings, 0 replies; 15+ messages in thread
From: Richard Henderson @ 2020-09-03 21:41 UTC (permalink / raw)
To: qemu-devel; +Cc: peter.maydell, Philippe Mathieu-Daudé
We already support duplication of 128-bit blocks. This extends
that support to 256-bit blocks. This will be needed by SVE2.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/tcg-op-gvec.c | 52 ++++++++++++++++++++++++++++++++++++++++++++---
1 file changed, 49 insertions(+), 3 deletions(-)
diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c
index fcc25b04e6..7ebd9e8298 100644
--- a/tcg/tcg-op-gvec.c
+++ b/tcg/tcg-op-gvec.c
@@ -1570,12 +1570,10 @@ void tcg_gen_gvec_dup_mem(unsigned vece, uint32_t dofs, uint32_t aofs,
do_dup(vece, dofs, oprsz, maxsz, NULL, in, 0);
tcg_temp_free_i64(in);
}
- } else {
+ } else if (vece == 4) {
/* 128-bit duplicate. */
- /* ??? Dup to 256-bit vector. */
int i;
- tcg_debug_assert(vece == 4);
tcg_debug_assert(oprsz >= 16);
if (TCG_TARGET_HAS_v128) {
TCGv_vec in = tcg_temp_new_vec(TCG_TYPE_V128);
@@ -1601,6 +1599,54 @@ void tcg_gen_gvec_dup_mem(unsigned vece, uint32_t dofs, uint32_t aofs,
if (oprsz < maxsz) {
expand_clr(dofs + oprsz, maxsz - oprsz);
}
+ } else if (vece == 5) {
+ /* 256-bit duplicate. */
+ int i;
+
+ tcg_debug_assert(oprsz >= 32);
+ tcg_debug_assert(oprsz % 32 == 0);
+ if (TCG_TARGET_HAS_v256) {
+ TCGv_vec in = tcg_temp_new_vec(TCG_TYPE_V256);
+
+ tcg_gen_ld_vec(in, cpu_env, aofs);
+ for (i = (aofs == dofs) * 32; i < oprsz; i += 32) {
+ tcg_gen_st_vec(in, cpu_env, dofs + i);
+ }
+ tcg_temp_free_vec(in);
+ } else if (TCG_TARGET_HAS_v128) {
+ TCGv_vec in0 = tcg_temp_new_vec(TCG_TYPE_V128);
+ TCGv_vec in1 = tcg_temp_new_vec(TCG_TYPE_V128);
+
+ tcg_gen_ld_vec(in0, cpu_env, aofs);
+ tcg_gen_ld_vec(in1, cpu_env, aofs + 16);
+ for (i = (aofs == dofs) * 32; i < oprsz; i += 32) {
+ tcg_gen_st_vec(in0, cpu_env, dofs + i);
+ tcg_gen_st_vec(in1, cpu_env, dofs + i + 16);
+ }
+ tcg_temp_free_vec(in0);
+ tcg_temp_free_vec(in1);
+ } else {
+ TCGv_i64 in[4];
+ int j;
+
+ for (j = 0; j < 4; ++j) {
+ in[j] = tcg_temp_new_i64();
+ tcg_gen_ld_i64(in[j], cpu_env, aofs + j * 8);
+ }
+ for (i = (aofs == dofs) * 32; i < oprsz; i += 32) {
+ for (j = 0; j < 4; ++j) {
+ tcg_gen_st_i64(in[j], cpu_env, dofs + i + j * 8);
+ }
+ }
+ for (j = 0; j < 4; ++j) {
+ tcg_temp_free_i64(in[j]);
+ }
+ }
+ if (oprsz < maxsz) {
+ expand_clr(dofs + oprsz, maxsz - oprsz);
+ }
+ } else {
+ g_assert_not_reached();
}
}
--
2.25.1
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PULL 0/5] tcg patch queue
2020-09-03 21:40 [PULL 0/5] tcg patch queue Richard Henderson
` (4 preceding siblings ...)
2020-09-03 21:41 ` [PULL 5/5] tcg: Implement 256-bit dup for tcg_gen_gvec_dup_mem Richard Henderson
@ 2020-09-06 13:07 ` Peter Maydell
5 siblings, 0 replies; 15+ messages in thread
From: Peter Maydell @ 2020-09-06 13:07 UTC (permalink / raw)
To: Richard Henderson; +Cc: QEMU Developers
On Thu, 3 Sep 2020 at 22:41, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> The following changes since commit 3dd23a4fb8fd72d2220a90a809f213999ffe7f3a:
>
> Merge remote-tracking branch 'remotes/legoater/tags/pull-aspeed-20200901' into staging (2020-09-03 14:12:48 +0100)
>
> are available in the Git repository at:
>
> https://github.com/rth7680/qemu.git tags/pull-tcg-20200903
>
> for you to fetch changes up to fe4b0b5bfa96c38ad1cad0689a86cca9f307e353:
>
> tcg: Implement 256-bit dup for tcg_gen_gvec_dup_mem (2020-09-03 13:13:58 -0700)
>
> ----------------------------------------------------------------
> Improve inlining in cputlb.c.
> Fix vector abs fallback.
> Only set parallel_cpus for SMP.
> Add vector dupm for 256-bit elements.
>
> ----------------------------------------------------------------
Applied, thanks.
Please update the changelog at https://wiki.qemu.org/ChangeLog/5.2
for any user-visible changes.
-- PMM
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PULL 3/5] softmmu/cpus: Only set parallel_cpus for SMP
2020-09-03 21:40 ` [PULL 3/5] softmmu/cpus: Only set parallel_cpus for SMP Richard Henderson
@ 2020-09-07 10:05 ` Claudio Fontana
2020-09-07 10:20 ` Philippe Mathieu-Daudé
0 siblings, 1 reply; 15+ messages in thread
From: Claudio Fontana @ 2020-09-07 10:05 UTC (permalink / raw)
To: Richard Henderson, qemu-devel
Cc: peter.maydell, Philippe Mathieu-Daudé, Paolo Bonzini
Hi Richard,
currently rebasing on top of this one,
just a question, why is the patch not directly using "current_machine"?
Is using MACHINE(qdev_get_machine()) preferrable here?
Thanks,
Claudio
On 9/3/20 11:40 PM, Richard Henderson wrote:
> Do not set parallel_cpus if there is only one cpu instantiated.
> This will allow tcg to use serial code to implement atomics.
>
> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> softmmu/cpus.c | 11 ++++++++++-
> 1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/softmmu/cpus.c b/softmmu/cpus.c
> index a802e899ab..e3b98065c9 100644
> --- a/softmmu/cpus.c
> +++ b/softmmu/cpus.c
> @@ -1895,6 +1895,16 @@ static void qemu_tcg_init_vcpu(CPUState *cpu)
> if (!tcg_region_inited) {
> tcg_region_inited = 1;
> tcg_region_init();
> + /*
> + * If MTTCG, and we will create multiple cpus,
> + * then we will have cpus running in parallel.
> + */
> + if (qemu_tcg_mttcg_enabled()) {
> + MachineState *ms = MACHINE(qdev_get_machine());
MachineState *ms = current_machine;
?
> + if (ms->smp.max_cpus > 1) {
> + parallel_cpus = true;
> + }
> + }
> }
>
> if (qemu_tcg_mttcg_enabled() || !single_tcg_cpu_thread) {
> @@ -1904,7 +1914,6 @@ static void qemu_tcg_init_vcpu(CPUState *cpu)
>
> if (qemu_tcg_mttcg_enabled()) {
> /* create a thread per vCPU with TCG (MTTCG) */
> - parallel_cpus = true;
> snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/TCG",
> cpu->cpu_index);
>
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PULL 3/5] softmmu/cpus: Only set parallel_cpus for SMP
2020-09-07 10:05 ` Claudio Fontana
@ 2020-09-07 10:20 ` Philippe Mathieu-Daudé
2020-09-07 16:30 ` Claudio Fontana
0 siblings, 1 reply; 15+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-09-07 10:20 UTC (permalink / raw)
To: Claudio Fontana, Richard Henderson, qemu-devel
Cc: peter.maydell, Paolo Bonzini
On 9/7/20 12:05 PM, Claudio Fontana wrote:
> Hi Richard,
>
> currently rebasing on top of this one,
> just a question, why is the patch not directly using "current_machine"?
For user mode?
>
> Is using MACHINE(qdev_get_machine()) preferrable here?
>
> Thanks,
>
> Claudio
>
> On 9/3/20 11:40 PM, Richard Henderson wrote:
>> Do not set parallel_cpus if there is only one cpu instantiated.
>> This will allow tcg to use serial code to implement atomics.
>>
>> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>> ---
>> softmmu/cpus.c | 11 ++++++++++-
>> 1 file changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/softmmu/cpus.c b/softmmu/cpus.c
>> index a802e899ab..e3b98065c9 100644
>> --- a/softmmu/cpus.c
>> +++ b/softmmu/cpus.c
>> @@ -1895,6 +1895,16 @@ static void qemu_tcg_init_vcpu(CPUState *cpu)
>> if (!tcg_region_inited) {
>> tcg_region_inited = 1;
>> tcg_region_init();
>> + /*
>> + * If MTTCG, and we will create multiple cpus,
>> + * then we will have cpus running in parallel.
>> + */
>> + if (qemu_tcg_mttcg_enabled()) {
>> + MachineState *ms = MACHINE(qdev_get_machine());
>
> MachineState *ms = current_machine;
> ?
>
>
>> + if (ms->smp.max_cpus > 1) {
>> + parallel_cpus = true;
>> + }
>> + }
>> }
>>
>> if (qemu_tcg_mttcg_enabled() || !single_tcg_cpu_thread) {
>> @@ -1904,7 +1914,6 @@ static void qemu_tcg_init_vcpu(CPUState *cpu)
>>
>> if (qemu_tcg_mttcg_enabled()) {
>> /* create a thread per vCPU with TCG (MTTCG) */
>> - parallel_cpus = true;
>> snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/TCG",
>> cpu->cpu_index);
>>
>>
>
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PULL 3/5] softmmu/cpus: Only set parallel_cpus for SMP
2020-09-07 10:20 ` Philippe Mathieu-Daudé
@ 2020-09-07 16:30 ` Claudio Fontana
2020-09-07 16:49 ` Philippe Mathieu-Daudé
0 siblings, 1 reply; 15+ messages in thread
From: Claudio Fontana @ 2020-09-07 16:30 UTC (permalink / raw)
To: Philippe Mathieu-Daudé
Cc: peter.maydell, Richard Henderson, qemu-devel, Paolo Bonzini
On 9/7/20 12:20 PM, Philippe Mathieu-Daudé wrote:
> On 9/7/20 12:05 PM, Claudio Fontana wrote:
>> Hi Richard,
>>
>> currently rebasing on top of this one,
>> just a question, why is the patch not directly using "current_machine"?
>
> For user mode?
In user mode I'd not expect softmmu/cpus.c to be built at all...
Ciao,
Claudio
>
>>
>> Is using MACHINE(qdev_get_machine()) preferrable here?
>>
>> Thanks,
>>
>> Claudio
>>
>> On 9/3/20 11:40 PM, Richard Henderson wrote:
>>> Do not set parallel_cpus if there is only one cpu instantiated.
>>> This will allow tcg to use serial code to implement atomics.
>>>
>>> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
>>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>>> ---
>>> softmmu/cpus.c | 11 ++++++++++-
>>> 1 file changed, 10 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/softmmu/cpus.c b/softmmu/cpus.c
>>> index a802e899ab..e3b98065c9 100644
>>> --- a/softmmu/cpus.c
>>> +++ b/softmmu/cpus.c
>>> @@ -1895,6 +1895,16 @@ static void qemu_tcg_init_vcpu(CPUState *cpu)
>>> if (!tcg_region_inited) {
>>> tcg_region_inited = 1;
>>> tcg_region_init();
>>> + /*
>>> + * If MTTCG, and we will create multiple cpus,
>>> + * then we will have cpus running in parallel.
>>> + */
>>> + if (qemu_tcg_mttcg_enabled()) {
>>> + MachineState *ms = MACHINE(qdev_get_machine());
>>
>> MachineState *ms = current_machine;
>> ?
>>
>>
>>> + if (ms->smp.max_cpus > 1) {
>>> + parallel_cpus = true;
>>> + }
>>> + }
>>> }
>>>
>>> if (qemu_tcg_mttcg_enabled() || !single_tcg_cpu_thread) {
>>> @@ -1904,7 +1914,6 @@ static void qemu_tcg_init_vcpu(CPUState *cpu)
>>>
>>> if (qemu_tcg_mttcg_enabled()) {
>>> /* create a thread per vCPU with TCG (MTTCG) */
>>> - parallel_cpus = true;
>>> snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/TCG",
>>> cpu->cpu_index);
>>>
>>>
>>
>>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PULL 3/5] softmmu/cpus: Only set parallel_cpus for SMP
2020-09-07 16:30 ` Claudio Fontana
@ 2020-09-07 16:49 ` Philippe Mathieu-Daudé
2020-09-08 7:09 ` Claudio Fontana
0 siblings, 1 reply; 15+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-09-07 16:49 UTC (permalink / raw)
To: Claudio Fontana
Cc: peter.maydell, Richard Henderson, qemu-devel, Paolo Bonzini
On 9/7/20 6:30 PM, Claudio Fontana wrote:
> On 9/7/20 12:20 PM, Philippe Mathieu-Daudé wrote:
>> On 9/7/20 12:05 PM, Claudio Fontana wrote:
>>> Hi Richard,
>>>
>>> currently rebasing on top of this one,
>>> just a question, why is the patch not directly using "current_machine"?
>>
>> For user mode?
>
> In user mode I'd not expect softmmu/cpus.c to be built at all...
Which is why :) current_machine is NULL in user-mode.
>
> Ciao,
>
> Claudio
>
>>
>>>
>>> Is using MACHINE(qdev_get_machine()) preferrable here?
>>>
>>> Thanks,
>>>
>>> Claudio
>>>
>>> On 9/3/20 11:40 PM, Richard Henderson wrote:
>>>> Do not set parallel_cpus if there is only one cpu instantiated.
>>>> This will allow tcg to use serial code to implement atomics.
>>>>
>>>> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
>>>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>>>> ---
>>>> softmmu/cpus.c | 11 ++++++++++-
>>>> 1 file changed, 10 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/softmmu/cpus.c b/softmmu/cpus.c
>>>> index a802e899ab..e3b98065c9 100644
>>>> --- a/softmmu/cpus.c
>>>> +++ b/softmmu/cpus.c
>>>> @@ -1895,6 +1895,16 @@ static void qemu_tcg_init_vcpu(CPUState *cpu)
>>>> if (!tcg_region_inited) {
>>>> tcg_region_inited = 1;
>>>> tcg_region_init();
>>>> + /*
>>>> + * If MTTCG, and we will create multiple cpus,
>>>> + * then we will have cpus running in parallel.
>>>> + */
>>>> + if (qemu_tcg_mttcg_enabled()) {
>>>> + MachineState *ms = MACHINE(qdev_get_machine());
>>>
>>> MachineState *ms = current_machine;
>>> ?
>>>
>>>
>>>> + if (ms->smp.max_cpus > 1) {
>>>> + parallel_cpus = true;
>>>> + }
>>>> + }
>>>> }
>>>>
>>>> if (qemu_tcg_mttcg_enabled() || !single_tcg_cpu_thread) {
>>>> @@ -1904,7 +1914,6 @@ static void qemu_tcg_init_vcpu(CPUState *cpu)
>>>>
>>>> if (qemu_tcg_mttcg_enabled()) {
>>>> /* create a thread per vCPU with TCG (MTTCG) */
>>>> - parallel_cpus = true;
>>>> snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/TCG",
>>>> cpu->cpu_index);
>>>>
>>>>
>>>
>>>
>
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PULL 3/5] softmmu/cpus: Only set parallel_cpus for SMP
2020-09-07 16:49 ` Philippe Mathieu-Daudé
@ 2020-09-08 7:09 ` Claudio Fontana
2020-09-08 7:56 ` Philippe Mathieu-Daudé
0 siblings, 1 reply; 15+ messages in thread
From: Claudio Fontana @ 2020-09-08 7:09 UTC (permalink / raw)
To: Philippe Mathieu-Daudé
Cc: peter.maydell, Richard Henderson, qemu-devel, Paolo Bonzini
On 9/7/20 6:49 PM, Philippe Mathieu-Daudé wrote:
> On 9/7/20 6:30 PM, Claudio Fontana wrote:
>> On 9/7/20 12:20 PM, Philippe Mathieu-Daudé wrote:
>>> On 9/7/20 12:05 PM, Claudio Fontana wrote:
>>>> Hi Richard,
>>>>
>>>> currently rebasing on top of this one,
>>>> just a question, why is the patch not directly using "current_machine"?
>>>
>>> For user mode?
>>
>> In user mode I'd not expect softmmu/cpus.c to be built at all...
>
> Which is why :) current_machine is NULL in user-mode.
Ciao Philippe,
then why does the patch change softmmu/cpus.c in a way that accounts for user mode?
The function that the patch changes is never called in user mode.
The patch could instead use current_machine without any concern of it being NULL, it will always be set in vl.c .
Ciao,
Claudio
>
>>
>> Ciao,
>>
>> Claudio
>>
>>>
>>>>
>>>> Is using MACHINE(qdev_get_machine()) preferrable here?
>>>>
>>>> Thanks,
>>>>
>>>> Claudio
>>>>
>>>> On 9/3/20 11:40 PM, Richard Henderson wrote:
>>>>> Do not set parallel_cpus if there is only one cpu instantiated.
>>>>> This will allow tcg to use serial code to implement atomics.
>>>>>
>>>>> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
>>>>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>>>>> ---
>>>>> softmmu/cpus.c | 11 ++++++++++-
>>>>> 1 file changed, 10 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/softmmu/cpus.c b/softmmu/cpus.c
>>>>> index a802e899ab..e3b98065c9 100644
>>>>> --- a/softmmu/cpus.c
>>>>> +++ b/softmmu/cpus.c
>>>>> @@ -1895,6 +1895,16 @@ static void qemu_tcg_init_vcpu(CPUState *cpu)
>>>>> if (!tcg_region_inited) {
>>>>> tcg_region_inited = 1;
>>>>> tcg_region_init();
>>>>> + /*
>>>>> + * If MTTCG, and we will create multiple cpus,
>>>>> + * then we will have cpus running in parallel.
>>>>> + */
>>>>> + if (qemu_tcg_mttcg_enabled()) {
>>>>> + MachineState *ms = MACHINE(qdev_get_machine());
>>>>
>>>> MachineState *ms = current_machine;
>>>> ?
>>>>
>>>>
>>>>> + if (ms->smp.max_cpus > 1) {
>>>>> + parallel_cpus = true;
>>>>> + }
>>>>> + }
>>>>> }
>>>>>
>>>>> if (qemu_tcg_mttcg_enabled() || !single_tcg_cpu_thread) {
>>>>> @@ -1904,7 +1914,6 @@ static void qemu_tcg_init_vcpu(CPUState *cpu)
>>>>>
>>>>> if (qemu_tcg_mttcg_enabled()) {
>>>>> /* create a thread per vCPU with TCG (MTTCG) */
>>>>> - parallel_cpus = true;
>>>>> snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/TCG",
>>>>> cpu->cpu_index);
>>>>>
>>>>>
>>>>
>>>>
>>
>>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PULL 3/5] softmmu/cpus: Only set parallel_cpus for SMP
2020-09-08 7:09 ` Claudio Fontana
@ 2020-09-08 7:56 ` Philippe Mathieu-Daudé
2020-09-10 8:28 ` Claudio Fontana
0 siblings, 1 reply; 15+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-09-08 7:56 UTC (permalink / raw)
To: Claudio Fontana
Cc: peter.maydell, Richard Henderson, qemu-devel, Paolo Bonzini
+Laurent
On 9/8/20 9:09 AM, Claudio Fontana wrote:
> On 9/7/20 6:49 PM, Philippe Mathieu-Daudé wrote:
>> On 9/7/20 6:30 PM, Claudio Fontana wrote:
>>> On 9/7/20 12:20 PM, Philippe Mathieu-Daudé wrote:
>>>> On 9/7/20 12:05 PM, Claudio Fontana wrote:
>>>>> Hi Richard,
>>>>>
>>>>> currently rebasing on top of this one,
>>>>> just a question, why is the patch not directly using "current_machine"?
>>>>
>>>> For user mode?
>>>
>>> In user mode I'd not expect softmmu/cpus.c to be built at all...
>>
>> Which is why :) current_machine is NULL in user-mode.
>
> Ciao Philippe,
>
> then why does the patch change softmmu/cpus.c in a way that accounts for user mode?
>
> The function that the patch changes is never called in user mode.
>
> The patch could instead use current_machine without any concern of it being NULL, it will always be set in vl.c .
Better send a patch to prove your point :)
I have bad remember about the last time I tried to understand/modify
that part, because IIRC the user-mode creates some fake machine to
satisfy various QEMU generic code assumptions. Sincerely I now prefer
stay away from this; too many unmerged patches there.
>
> Ciao,
>
> Claudio
>
>>
>>>
>>> Ciao,
>>>
>>> Claudio
>>>
>>>>
>>>>>
>>>>> Is using MACHINE(qdev_get_machine()) preferrable here?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Claudio
>>>>>
>>>>> On 9/3/20 11:40 PM, Richard Henderson wrote:
>>>>>> Do not set parallel_cpus if there is only one cpu instantiated.
>>>>>> This will allow tcg to use serial code to implement atomics.
>>>>>>
>>>>>> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
>>>>>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>>>>>> ---
>>>>>> softmmu/cpus.c | 11 ++++++++++-
>>>>>> 1 file changed, 10 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/softmmu/cpus.c b/softmmu/cpus.c
>>>>>> index a802e899ab..e3b98065c9 100644
>>>>>> --- a/softmmu/cpus.c
>>>>>> +++ b/softmmu/cpus.c
>>>>>> @@ -1895,6 +1895,16 @@ static void qemu_tcg_init_vcpu(CPUState *cpu)
>>>>>> if (!tcg_region_inited) {
>>>>>> tcg_region_inited = 1;
>>>>>> tcg_region_init();
>>>>>> + /*
>>>>>> + * If MTTCG, and we will create multiple cpus,
>>>>>> + * then we will have cpus running in parallel.
>>>>>> + */
>>>>>> + if (qemu_tcg_mttcg_enabled()) {
>>>>>> + MachineState *ms = MACHINE(qdev_get_machine());
>>>>>
>>>>> MachineState *ms = current_machine;
>>>>> ?
>>>>>
>>>>>
>>>>>> + if (ms->smp.max_cpus > 1) {
>>>>>> + parallel_cpus = true;
>>>>>> + }
>>>>>> + }
>>>>>> }
>>>>>>
>>>>>> if (qemu_tcg_mttcg_enabled() || !single_tcg_cpu_thread) {
>>>>>> @@ -1904,7 +1914,6 @@ static void qemu_tcg_init_vcpu(CPUState *cpu)
>>>>>>
>>>>>> if (qemu_tcg_mttcg_enabled()) {
>>>>>> /* create a thread per vCPU with TCG (MTTCG) */
>>>>>> - parallel_cpus = true;
>>>>>> snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/TCG",
>>>>>> cpu->cpu_index);
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>
>>>
>
>
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PULL 3/5] softmmu/cpus: Only set parallel_cpus for SMP
2020-09-08 7:56 ` Philippe Mathieu-Daudé
@ 2020-09-10 8:28 ` Claudio Fontana
2020-09-10 10:32 ` Philippe Mathieu-Daudé
0 siblings, 1 reply; 15+ messages in thread
From: Claudio Fontana @ 2020-09-10 8:28 UTC (permalink / raw)
To: Philippe Mathieu-Daudé
Cc: Laurent Vivier, peter.maydell, Richard Henderson, qemu-devel,
Paolo Bonzini
Hi Philippe,
On 9/8/20 9:56 AM, Philippe Mathieu-Daudé wrote:
> +Laurent
>
> On 9/8/20 9:09 AM, Claudio Fontana wrote:
>> On 9/7/20 6:49 PM, Philippe Mathieu-Daudé wrote:
>>> On 9/7/20 6:30 PM, Claudio Fontana wrote:
>>>> On 9/7/20 12:20 PM, Philippe Mathieu-Daudé wrote:
>>>>> On 9/7/20 12:05 PM, Claudio Fontana wrote:
>>>>>> Hi Richard,
>>>>>>
>>>>>> currently rebasing on top of this one,
>>>>>> just a question, why is the patch not directly using "current_machine"?
>>>>>
>>>>> For user mode?
>>>>
>>>> In user mode I'd not expect softmmu/cpus.c to be built at all...
>>>
>>> Which is why :) current_machine is NULL in user-mode.
>>
>> Ciao Philippe,
>>
>> then why does the patch change softmmu/cpus.c in a way that accounts for user mode?
>>
>> The function that the patch changes is never called in user mode.
>>
>> The patch could instead use current_machine without any concern of it being NULL, it will always be set in vl.c .
>
> Better send a patch to prove your point :)
Yes, I am already testing without problems the cpus-refactoring series with this applied:
commit 53f6db772f1522025650441102b16be17773bdc9
Author: Claudio Fontana <cfontana@suse.de>
Date: Tue Sep 8 10:59:07 2020 +0200
accel/tcg: use current_machine as it is always set for softmmu
current_machine is always set before accelerators are initialized,
so use that instead of MACHINE(qdev_get_machine()).
Signed-off-by: Claudio Fontana <cfontana@suse.de>
diff --git a/accel/tcg/tcg-cpus.c b/accel/tcg/tcg-cpus.c
index ec7158b55e..05af1168a2 100644
--- a/accel/tcg/tcg-cpus.c
+++ b/accel/tcg/tcg-cpus.c
@@ -484,7 +484,7 @@ static void tcg_start_vcpu_thread(CPUState *cpu)
* then we will have cpus running in parallel.
*/
if (qemu_tcg_mttcg_enabled()) {
- MachineState *ms = MACHINE(qdev_get_machine());
+ MachineState *ms = current_machine;
if (ms->smp.max_cpus > 1) {
parallel_cpus = true;
}
>
> I have bad remember about the last time I tried to understand/modify
> that part, because IIRC the user-mode creates some fake machine to
> satisfy various QEMU generic code assumptions. Sincerely I now prefer
> stay away from this; too many unmerged patches there.
linux-user/main.c and bsd-user/main.c seem to use cpu_create() to create the cpus,
then they have their own cpu_loop(), they do not use any of the cpus.c code.
By contrast, softmmu/vl.c initializes current_machine in qemu_init(),
even before accelerator is chosen and configured.
So there is no chance currently that current_machine is not set correctly when
the accelerator vcpu startup code in is called.
Ciao,
CLaudio
>
>>
>> Ciao,
>>
>> Claudio
>>
>>>
>>>>
>>>> Ciao,
>>>>
>>>> Claudio
>>>>
>>>>>
>>>>>>
>>>>>> Is using MACHINE(qdev_get_machine()) preferrable here?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Claudio
>>>>>>
>>>>>> On 9/3/20 11:40 PM, Richard Henderson wrote:
>>>>>>> Do not set parallel_cpus if there is only one cpu instantiated.
>>>>>>> This will allow tcg to use serial code to implement atomics.
>>>>>>>
>>>>>>> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
>>>>>>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>>>>>>> ---
>>>>>>> softmmu/cpus.c | 11 ++++++++++-
>>>>>>> 1 file changed, 10 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/softmmu/cpus.c b/softmmu/cpus.c
>>>>>>> index a802e899ab..e3b98065c9 100644
>>>>>>> --- a/softmmu/cpus.c
>>>>>>> +++ b/softmmu/cpus.c
>>>>>>> @@ -1895,6 +1895,16 @@ static void qemu_tcg_init_vcpu(CPUState *cpu)
>>>>>>> if (!tcg_region_inited) {
>>>>>>> tcg_region_inited = 1;
>>>>>>> tcg_region_init();
>>>>>>> + /*
>>>>>>> + * If MTTCG, and we will create multiple cpus,
>>>>>>> + * then we will have cpus running in parallel.
>>>>>>> + */
>>>>>>> + if (qemu_tcg_mttcg_enabled()) {
>>>>>>> + MachineState *ms = MACHINE(qdev_get_machine());
>>>>>>
>>>>>> MachineState *ms = current_machine;
>>>>>> ?
>>>>>>
>>>>>>
>>>>>>> + if (ms->smp.max_cpus > 1) {
>>>>>>> + parallel_cpus = true;
>>>>>>> + }
>>>>>>> + }
>>>>>>> }
>>>>>>>
>>>>>>> if (qemu_tcg_mttcg_enabled() || !single_tcg_cpu_thread) {
>>>>>>> @@ -1904,7 +1914,6 @@ static void qemu_tcg_init_vcpu(CPUState *cpu)
>>>>>>>
>>>>>>> if (qemu_tcg_mttcg_enabled()) {
>>>>>>> /* create a thread per vCPU with TCG (MTTCG) */
>>>>>>> - parallel_cpus = true;
>>>>>>> snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/TCG",
>>>>>>> cpu->cpu_index);
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>
>>>>
>>
>>
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PULL 3/5] softmmu/cpus: Only set parallel_cpus for SMP
2020-09-10 8:28 ` Claudio Fontana
@ 2020-09-10 10:32 ` Philippe Mathieu-Daudé
0 siblings, 0 replies; 15+ messages in thread
From: Philippe Mathieu-Daudé @ 2020-09-10 10:32 UTC (permalink / raw)
To: Claudio Fontana
Cc: Laurent Vivier, peter.maydell, Richard Henderson, qemu-devel,
Paolo Bonzini
On 9/10/20 10:28 AM, Claudio Fontana wrote:
> Hi Philippe,
>
> On 9/8/20 9:56 AM, Philippe Mathieu-Daudé wrote:
>> +Laurent
>>
>> On 9/8/20 9:09 AM, Claudio Fontana wrote:
>>> On 9/7/20 6:49 PM, Philippe Mathieu-Daudé wrote:
>>>> On 9/7/20 6:30 PM, Claudio Fontana wrote:
>>>>> On 9/7/20 12:20 PM, Philippe Mathieu-Daudé wrote:
>>>>>> On 9/7/20 12:05 PM, Claudio Fontana wrote:
>>>>>>> Hi Richard,
>>>>>>>
>>>>>>> currently rebasing on top of this one,
>>>>>>> just a question, why is the patch not directly using "current_machine"?
>>>>>>
>>>>>> For user mode?
>>>>>
>>>>> In user mode I'd not expect softmmu/cpus.c to be built at all...
>>>>
>>>> Which is why :) current_machine is NULL in user-mode.
>>>
>>> Ciao Philippe,
>>>
>>> then why does the patch change softmmu/cpus.c in a way that accounts for user mode?
>>>
>>> The function that the patch changes is never called in user mode.
>>>
>>> The patch could instead use current_machine without any concern of it being NULL, it will always be set in vl.c .
>>
>> Better send a patch to prove your point :)
>
> Yes, I am already testing without problems the cpus-refactoring series with this applied:
>
> commit 53f6db772f1522025650441102b16be17773bdc9
> Author: Claudio Fontana <cfontana@suse.de>
> Date: Tue Sep 8 10:59:07 2020 +0200
>
> accel/tcg: use current_machine as it is always set for softmmu
>
> current_machine is always set before accelerators are initialized,
> so use that instead of MACHINE(qdev_get_machine()).
>
> Signed-off-by: Claudio Fontana <cfontana@suse.de>
>
> diff --git a/accel/tcg/tcg-cpus.c b/accel/tcg/tcg-cpus.c
> index ec7158b55e..05af1168a2 100644
> --- a/accel/tcg/tcg-cpus.c
> +++ b/accel/tcg/tcg-cpus.c
> @@ -484,7 +484,7 @@ static void tcg_start_vcpu_thread(CPUState *cpu)
> * then we will have cpus running in parallel.
> */
> if (qemu_tcg_mttcg_enabled()) {
> - MachineState *ms = MACHINE(qdev_get_machine());
> + MachineState *ms = current_machine;
> if (ms->smp.max_cpus > 1) {
> parallel_cpus = true;
> }
>
>
>
>>
>> I have bad remember about the last time I tried to understand/modify
>> that part, because IIRC the user-mode creates some fake machine to
>> satisfy various QEMU generic code assumptions. Sincerely I now prefer
>> stay away from this; too many unmerged patches there.
>
>
> linux-user/main.c and bsd-user/main.c seem to use cpu_create() to create the cpus,
> then they have their own cpu_loop(), they do not use any of the cpus.c code.
>
> By contrast, softmmu/vl.c initializes current_machine in qemu_init(),
> even before accelerator is chosen and configured.
>
> So there is no chance currently that current_machine is not set correctly when
> the accelerator vcpu startup code in is called.
Thanks for checking and correcting me!
We are good then :)
>
> Ciao,
>
> CLaudio
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2020-09-10 10:36 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-03 21:40 [PULL 0/5] tcg patch queue Richard Henderson
2020-09-03 21:40 ` [PULL 1/5] cputlb: Make store_helper less fragile to compiler optimizations Richard Henderson
2020-09-03 21:40 ` [PULL 2/5] tcg: Fix tcg gen for vectorized absolute value Richard Henderson
2020-09-03 21:40 ` [PULL 3/5] softmmu/cpus: Only set parallel_cpus for SMP Richard Henderson
2020-09-07 10:05 ` Claudio Fontana
2020-09-07 10:20 ` Philippe Mathieu-Daudé
2020-09-07 16:30 ` Claudio Fontana
2020-09-07 16:49 ` Philippe Mathieu-Daudé
2020-09-08 7:09 ` Claudio Fontana
2020-09-08 7:56 ` Philippe Mathieu-Daudé
2020-09-10 8:28 ` Claudio Fontana
2020-09-10 10:32 ` Philippe Mathieu-Daudé
2020-09-03 21:41 ` [PULL 4/5] tcg: Eliminate one store for in-place 128-bit dup_mem Richard Henderson
2020-09-03 21:41 ` [PULL 5/5] tcg: Implement 256-bit dup for tcg_gen_gvec_dup_mem Richard Henderson
2020-09-06 13:07 ` [PULL 0/5] tcg patch queue Peter Maydell
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).