* [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug
@ 2021-03-14 21:26 Richard Henderson
2021-03-14 21:26 ` [PATCH v2 01/29] meson: Split out tcg/meson.build Richard Henderson
` (30 more replies)
0 siblings, 31 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:26 UTC (permalink / raw)
To: qemu-devel; +Cc: r.bolshakov, j
Changes for v2:
* Move tcg_init_ctx someplace more private (patch 29)
* Round result of tb_size based on qemu_get_host_physmem (patch 26)
Blurb for v1:
It took a few more patches than imagined to unify the two
places in which we manipulate the tcg code_gen buffer, but
the result is surely cleaner.
There's a lot more that could be done to clean up this part
of tcg too. I tried to not get too side-tracked, but didn't
wholly succeed.
r~
Richard Henderson (29):
meson: Split out tcg/meson.build
meson: Split out fpu/meson.build
tcg: Re-order tcg_region_init vs tcg_prologue_init
tcg: Remove error return from tcg_region_initial_alloc__locked
tcg: Split out tcg_region_initial_alloc
tcg: Split out tcg_region_prologue_set
tcg: Split out region.c
accel/tcg: Inline cpu_gen_init
accel/tcg: Move alloc_code_gen_buffer to tcg/region.c
accel/tcg: Rename tcg_init to tcg_init_machine
tcg: Create tcg_init
accel/tcg: Merge tcg_exec_init into tcg_init_machine
accel/tcg: Pass down max_cpus to tcg_init
tcg: Introduce tcg_max_ctxs
tcg: Move MAX_CODE_GEN_BUFFER_SIZE to tcg-target.h
tcg: Replace region.end with region.total_size
tcg: Rename region.start to region.after_prologue
tcg: Tidy tcg_n_regions
tcg: Tidy split_cross_256mb
tcg: Move in_code_gen_buffer and tests to region.c
tcg: Allocate code_gen_buffer into struct tcg_region_state
tcg: Return the map protection from alloc_code_gen_buffer
tcg: Sink qemu_madvise call to common code
tcg: Do not set guard pages in the rx buffer
util/osdep: Add qemu_mprotect_rw
tcg: Round the tb_size default from qemu_get_host_physmem
tcg: Merge buffer protection and guard page protection
tcg: When allocating for !splitwx, begin with PROT_NONE
tcg: Move tcg_init_ctx and tcg_ctx from accel/tcg/
meson.build | 13 +-
accel/tcg/internal.h | 2 +
include/qemu/osdep.h | 1 +
include/sysemu/tcg.h | 2 -
include/tcg/tcg.h | 15 +-
tcg/aarch64/tcg-target.h | 1 +
tcg/arm/tcg-target.h | 1 +
tcg/i386/tcg-target.h | 2 +
tcg/internal.h | 40 ++
tcg/mips/tcg-target.h | 6 +
tcg/ppc/tcg-target.h | 2 +
tcg/riscv/tcg-target.h | 1 +
tcg/s390/tcg-target.h | 3 +
tcg/sparc/tcg-target.h | 1 +
tcg/tci/tcg-target.h | 1 +
accel/tcg/tcg-all.c | 33 +-
accel/tcg/translate-all.c | 439 +----------------
bsd-user/main.c | 1 -
linux-user/main.c | 1 -
tcg/region.c | 991 ++++++++++++++++++++++++++++++++++++++
tcg/tcg.c | 634 ++----------------------
util/osdep.c | 9 +
fpu/meson.build | 1 +
tcg/meson.build | 14 +
24 files changed, 1139 insertions(+), 1075 deletions(-)
create mode 100644 tcg/internal.h
create mode 100644 tcg/region.c
create mode 100644 fpu/meson.build
create mode 100644 tcg/meson.build
--
2.25.1
* [PATCH v2 01/29] meson: Split out tcg/meson.build
2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
@ 2021-03-14 21:26 ` Richard Henderson
2021-03-15 23:09 ` Roman Bolshakov
2021-03-14 21:26 ` [PATCH v2 02/29] meson: Split out fpu/meson.build Richard Henderson
` (29 subsequent siblings)
30 siblings, 1 reply; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:26 UTC (permalink / raw)
To: qemu-devel; +Cc: r.bolshakov, j, Philippe Mathieu-Daudé
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
meson.build | 9 ++-------
tcg/meson.build | 13 +++++++++++++
2 files changed, 15 insertions(+), 7 deletions(-)
create mode 100644 tcg/meson.build
diff --git a/meson.build b/meson.build
index a7d2dd429d..742f45c8d8 100644
--- a/meson.build
+++ b/meson.build
@@ -1936,14 +1936,8 @@ specific_ss.add(files('cpu.c', 'disas.c', 'gdbstub.c'), capstone)
specific_ss.add(files('exec-vary.c'))
specific_ss.add(when: 'CONFIG_TCG', if_true: files(
'fpu/softfloat.c',
- 'tcg/optimize.c',
- 'tcg/tcg-common.c',
- 'tcg/tcg-op-gvec.c',
- 'tcg/tcg-op-vec.c',
- 'tcg/tcg-op.c',
- 'tcg/tcg.c',
))
-specific_ss.add(when: 'CONFIG_TCG_INTERPRETER', if_true: files('disas/tci.c', 'tcg/tci.c'))
+specific_ss.add(when: 'CONFIG_TCG_INTERPRETER', if_true: files('disas/tci.c'))
subdir('backends')
subdir('disas')
@@ -1953,6 +1947,7 @@ subdir('net')
subdir('replay')
subdir('semihosting')
subdir('hw')
+subdir('tcg')
subdir('accel')
subdir('plugins')
subdir('bsd-user')
diff --git a/tcg/meson.build b/tcg/meson.build
new file mode 100644
index 0000000000..84064a341e
--- /dev/null
+++ b/tcg/meson.build
@@ -0,0 +1,13 @@
+tcg_ss = ss.source_set()
+
+tcg_ss.add(files(
+ 'optimize.c',
+ 'tcg.c',
+ 'tcg-common.c',
+ 'tcg-op.c',
+ 'tcg-op-gvec.c',
+ 'tcg-op-vec.c',
+))
+tcg_ss.add(when: 'CONFIG_TCG_INTERPRETER', if_true: files('tci.c'))
+
+specific_ss.add_all(when: 'CONFIG_TCG', if_true: tcg_ss)
--
2.25.1
* [PATCH v2 02/29] meson: Split out fpu/meson.build
2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
2021-03-14 21:26 ` [PATCH v2 01/29] meson: Split out tcg/meson.build Richard Henderson
@ 2021-03-14 21:26 ` Richard Henderson
2021-03-15 23:10 ` Roman Bolshakov
2021-03-14 21:26 ` [PATCH v2 03/29] tcg: Re-order tcg_region_init vs tcg_prologue_init Richard Henderson
` (28 subsequent siblings)
30 siblings, 1 reply; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:26 UTC (permalink / raw)
To: qemu-devel; +Cc: r.bolshakov, j, Philippe Mathieu-Daudé
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
meson.build | 4 +---
fpu/meson.build | 1 +
2 files changed, 2 insertions(+), 3 deletions(-)
create mode 100644 fpu/meson.build
diff --git a/meson.build b/meson.build
index 742f45c8d8..bfa24b836e 100644
--- a/meson.build
+++ b/meson.build
@@ -1934,9 +1934,6 @@ subdir('softmmu')
common_ss.add(capstone)
specific_ss.add(files('cpu.c', 'disas.c', 'gdbstub.c'), capstone)
specific_ss.add(files('exec-vary.c'))
-specific_ss.add(when: 'CONFIG_TCG', if_true: files(
- 'fpu/softfloat.c',
-))
specific_ss.add(when: 'CONFIG_TCG_INTERPRETER', if_true: files('disas/tci.c'))
subdir('backends')
@@ -1948,6 +1945,7 @@ subdir('replay')
subdir('semihosting')
subdir('hw')
subdir('tcg')
+subdir('fpu')
subdir('accel')
subdir('plugins')
subdir('bsd-user')
diff --git a/fpu/meson.build b/fpu/meson.build
new file mode 100644
index 0000000000..1a9992ded5
--- /dev/null
+++ b/fpu/meson.build
@@ -0,0 +1 @@
+specific_ss.add(when: 'CONFIG_TCG', if_true: files('softfloat.c'))
--
2.25.1
* [PATCH v2 03/29] tcg: Re-order tcg_region_init vs tcg_prologue_init
2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
2021-03-14 21:26 ` [PATCH v2 01/29] meson: Split out tcg/meson.build Richard Henderson
2021-03-14 21:26 ` [PATCH v2 02/29] meson: Split out fpu/meson.build Richard Henderson
@ 2021-03-14 21:26 ` Richard Henderson
2021-03-15 23:37 ` Roman Bolshakov
2021-03-14 21:26 ` [PATCH v2 04/29] tcg: Remove error return from tcg_region_initial_alloc__locked Richard Henderson
` (27 subsequent siblings)
30 siblings, 1 reply; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:26 UTC (permalink / raw)
To: qemu-devel; +Cc: r.bolshakov, j
Instead of delaying tcg_region_init until after tcg_prologue_init
is complete, do tcg_region_init first and let tcg_prologue_init
shrink the first region by the size of the generated prologue.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
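Note for readers: the re-ordering described above can be pictured with a
small stand-alone sketch. It is not code from this series. The names
toy_region and toy_prologue_set are hypothetical, and the real
bookkeeping in tcg/tcg.c tracks more state (alignment, guard pages,
per-context pointers). The sketch only shows the idea that the buffer is
partitioned first and the prologue is then deducted from the front of
the first region.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the region bookkeeping. */
struct toy_region {
    char *start;   /* first usable byte of the buffer */
    char *end;     /* one past the last usable byte */
};

/*
 * Called after the prologue has been emitted at r->start.
 * Deducts the prologue from the front of the buffer and returns
 * the number of bytes left for translated code.
 */
static size_t toy_prologue_set(struct toy_region *r, size_t prologue_size)
{
    assert(prologue_size <= (size_t)(r->end - r->start));
    r->start += prologue_size;
    return (size_t)(r->end - r->start);
}
```

With this ordering the caller no longer needs to hand-patch the buffer
bounds before the regions exist; it only moves the start pointer once
the prologue size is known.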
accel/tcg/tcg-all.c | 11 ---------
accel/tcg/translate-all.c | 3 +++
bsd-user/main.c | 1 -
linux-user/main.c | 1 -
tcg/tcg.c | 52 ++++++++++++++-------------------------
5 files changed, 22 insertions(+), 46 deletions(-)
diff --git a/accel/tcg/tcg-all.c b/accel/tcg/tcg-all.c
index e378c2db73..f132033999 100644
--- a/accel/tcg/tcg-all.c
+++ b/accel/tcg/tcg-all.c
@@ -111,17 +111,6 @@ static int tcg_init(MachineState *ms)
tcg_exec_init(s->tb_size * 1024 * 1024, s->splitwx_enabled);
mttcg_enabled = s->mttcg_enabled;
-
- /*
- * Initialize TCG regions only for softmmu.
- *
- * This needs to be done later for user mode, because the prologue
- * generation needs to be delayed so that GUEST_BASE is already set.
- */
-#ifndef CONFIG_USER_ONLY
- tcg_region_init();
-#endif /* !CONFIG_USER_ONLY */
-
return 0;
}
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index f32df8b240..b9057567f4 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -1339,6 +1339,9 @@ void tcg_exec_init(unsigned long tb_size, int splitwx)
splitwx, &error_fatal);
assert(ok);
+ /* TODO: allocating regions is hand-in-glove with code_gen_buffer. */
+ tcg_region_init();
+
#if defined(CONFIG_SOFTMMU)
/* There's no guest base to take into account, so go ahead and
initialize the prologue now. */
diff --git a/bsd-user/main.c b/bsd-user/main.c
index 798aba512c..3669d2b89e 100644
--- a/bsd-user/main.c
+++ b/bsd-user/main.c
@@ -994,7 +994,6 @@ int main(int argc, char **argv)
generating the prologue until now so that the prologue can take
the real value of GUEST_BASE into account. */
tcg_prologue_init(tcg_ctx);
- tcg_region_init();
/* build Task State */
memset(ts, 0, sizeof(TaskState));
diff --git a/linux-user/main.c b/linux-user/main.c
index 4f4746dce8..1bc48ca954 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -850,7 +850,6 @@ int main(int argc, char **argv, char **envp)
generating the prologue until now so that the prologue can take
the real value of GUEST_BASE into account. */
tcg_prologue_init(tcg_ctx);
- tcg_region_init();
target_cpu_copy_regs(env, regs);
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 2991112829..0a2e5710de 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -1204,32 +1204,18 @@ TranslationBlock *tcg_tb_alloc(TCGContext *s)
void tcg_prologue_init(TCGContext *s)
{
- size_t prologue_size, total_size;
- void *buf0, *buf1;
+ size_t prologue_size;
/* Put the prologue at the beginning of code_gen_buffer. */
- buf0 = s->code_gen_buffer;
- total_size = s->code_gen_buffer_size;
- s->code_ptr = buf0;
- s->code_buf = buf0;
+ tcg_region_assign(s, 0);
+ s->code_ptr = s->code_gen_ptr;
+ s->code_buf = s->code_gen_ptr;
s->data_gen_ptr = NULL;
- /*
- * The region trees are not yet configured, but tcg_splitwx_to_rx
- * needs the bounds for an assert.
- */
- region.start = buf0;
- region.end = buf0 + total_size;
-
#ifndef CONFIG_TCG_INTERPRETER
- tcg_qemu_tb_exec = (tcg_prologue_fn *)tcg_splitwx_to_rx(buf0);
+ tcg_qemu_tb_exec = (tcg_prologue_fn *)tcg_splitwx_to_rx(s->code_ptr);
#endif
- /* Compute a high-water mark, at which we voluntarily flush the buffer
- and start over. The size here is arbitrary, significantly larger
- than we expect the code generation for any one opcode to require. */
- s->code_gen_highwater = s->code_gen_buffer + (total_size - TCG_HIGHWATER);
-
#ifdef TCG_TARGET_NEED_POOL_LABELS
s->pool_labels = NULL;
#endif
@@ -1246,32 +1232,32 @@ void tcg_prologue_init(TCGContext *s)
}
#endif
- buf1 = s->code_ptr;
+ prologue_size = tcg_current_code_size(s);
+
#ifndef CONFIG_TCG_INTERPRETER
- flush_idcache_range((uintptr_t)tcg_splitwx_to_rx(buf0), (uintptr_t)buf0,
- tcg_ptr_byte_diff(buf1, buf0));
+ flush_idcache_range((uintptr_t)tcg_splitwx_to_rx(s->code_buf),
+ (uintptr_t)s->code_buf, prologue_size);
#endif
- /* Deduct the prologue from the buffer. */
- prologue_size = tcg_current_code_size(s);
- s->code_gen_ptr = buf1;
- s->code_gen_buffer = buf1;
- s->code_buf = buf1;
- total_size -= prologue_size;
- s->code_gen_buffer_size = total_size;
+ /* Deduct the prologue from the first region. */
+ region.start = s->code_ptr;
- tcg_register_jit(tcg_splitwx_to_rx(s->code_gen_buffer), total_size);
+ /* Recompute boundaries of the first region. */
+ tcg_region_assign(s, 0);
+
+ tcg_register_jit(tcg_splitwx_to_rx(region.start),
+ region.end - region.start);
#ifdef DEBUG_DISAS
if (qemu_loglevel_mask(CPU_LOG_TB_OUT_ASM)) {
FILE *logfile = qemu_log_lock();
qemu_log("PROLOGUE: [size=%zu]\n", prologue_size);
if (s->data_gen_ptr) {
- size_t code_size = s->data_gen_ptr - buf0;
+ size_t code_size = s->data_gen_ptr - s->code_gen_ptr;
size_t data_size = prologue_size - code_size;
size_t i;
- log_disas(buf0, code_size);
+ log_disas(s->code_gen_ptr, code_size);
for (i = 0; i < data_size; i += sizeof(tcg_target_ulong)) {
if (sizeof(tcg_target_ulong) == 8) {
@@ -1285,7 +1271,7 @@ void tcg_prologue_init(TCGContext *s)
}
}
} else {
- log_disas(buf0, prologue_size);
+ log_disas(s->code_gen_ptr, prologue_size);
}
qemu_log("\n");
qemu_log_flush();
--
2.25.1
* [PATCH v2 04/29] tcg: Remove error return from tcg_region_initial_alloc__locked
2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
` (2 preceding siblings ...)
2021-03-14 21:26 ` [PATCH v2 03/29] tcg: Re-order tcg_region_init vs tcg_prologue_init Richard Henderson
@ 2021-03-14 21:26 ` Richard Henderson
2021-03-14 21:27 ` [PATCH v2 05/29] tcg: Split out tcg_region_initial_alloc Richard Henderson
` (26 subsequent siblings)
30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:26 UTC (permalink / raw)
To: qemu-devel; +Cc: r.bolshakov, j, Philippe Mathieu-Daudé
All callers immediately assert on error, so move the assert
into the function itself.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
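Note for readers: a minimal sketch of the refactoring pattern described
above, assuming a toy allocator. The names toy_pool, toy_alloc and
toy_initial_alloc are hypothetical and do not appear in QEMU. The
"returns true on error" convention mirrors the one used by
tcg_region_alloc__locked in the diff below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_pool {
    size_t current;  /* next free region index */
    size_t n;        /* total number of regions */
};

/* Returns true on error, matching the convention in the patch. */
static bool toy_alloc(struct toy_pool *p, size_t *idx)
{
    if (p->current == p->n) {
        return true;
    }
    *idx = p->current++;
    return false;
}

/*
 * The initial allocation is expected never to fail, so the check
 * lives here once instead of being repeated at every call site.
 */
static size_t toy_initial_alloc(struct toy_pool *p)
{
    size_t idx = 0;
    bool err = toy_alloc(p, &idx);

    assert(!err);
    return idx;
}
```

Callers then shrink to a single call with no local error variable, which
is the shape the three call sites take in the diff.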
tcg/tcg.c | 19 ++++++-------------
1 file changed, 6 insertions(+), 13 deletions(-)
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 0a2e5710de..2b631fccdf 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -720,9 +720,10 @@ static bool tcg_region_alloc(TCGContext *s)
* Perform a context's first region allocation.
* This function does _not_ increment region.agg_size_full.
*/
-static inline bool tcg_region_initial_alloc__locked(TCGContext *s)
+static void tcg_region_initial_alloc__locked(TCGContext *s)
{
- return tcg_region_alloc__locked(s);
+ bool err = tcg_region_alloc__locked(s);
+ g_assert(!err);
}
/* Call from a safe-work context */
@@ -737,9 +738,7 @@ void tcg_region_reset_all(void)
for (i = 0; i < n_ctxs; i++) {
TCGContext *s = qatomic_read(&tcg_ctxs[i]);
- bool err = tcg_region_initial_alloc__locked(s);
-
- g_assert(!err);
+ tcg_region_initial_alloc__locked(s);
}
qemu_mutex_unlock(&region.lock);
@@ -874,11 +873,7 @@ void tcg_region_init(void)
/* In user-mode we support only one ctx, so do the initial allocation now */
#ifdef CONFIG_USER_ONLY
- {
- bool err = tcg_region_initial_alloc__locked(tcg_ctx);
-
- g_assert(!err);
- }
+ tcg_region_initial_alloc__locked(tcg_ctx);
#endif
}
@@ -940,7 +935,6 @@ void tcg_register_thread(void)
MachineState *ms = MACHINE(qdev_get_machine());
TCGContext *s = g_malloc(sizeof(*s));
unsigned int i, n;
- bool err;
*s = tcg_init_ctx;
@@ -964,8 +958,7 @@ void tcg_register_thread(void)
tcg_ctx = s;
qemu_mutex_lock(&region.lock);
- err = tcg_region_initial_alloc__locked(tcg_ctx);
- g_assert(!err);
+ tcg_region_initial_alloc__locked(s);
qemu_mutex_unlock(&region.lock);
}
#endif /* !CONFIG_USER_ONLY */
--
2.25.1
* [PATCH v2 05/29] tcg: Split out tcg_region_initial_alloc
2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
` (3 preceding siblings ...)
2021-03-14 21:26 ` [PATCH v2 04/29] tcg: Remove error return from tcg_region_initial_alloc__locked Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
2021-03-14 21:27 ` [PATCH v2 06/29] tcg: Split out tcg_region_prologue_set Richard Henderson
` (25 subsequent siblings)
30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
To: qemu-devel; +Cc: r.bolshakov, j
This has only one user, and currently needs an ifdef,
but will make more sense after some code motion.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/tcg.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 2b631fccdf..3316a22bde 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -726,6 +726,15 @@ static void tcg_region_initial_alloc__locked(TCGContext *s)
g_assert(!err);
}
+#ifndef CONFIG_USER_ONLY
+static void tcg_region_initial_alloc(TCGContext *s)
+{
+ qemu_mutex_lock(&region.lock);
+ tcg_region_initial_alloc__locked(s);
+ qemu_mutex_unlock(&region.lock);
+}
+#endif
+
/* Call from a safe-work context */
void tcg_region_reset_all(void)
{
@@ -957,9 +966,7 @@ void tcg_register_thread(void)
}
tcg_ctx = s;
- qemu_mutex_lock(&region.lock);
- tcg_region_initial_alloc__locked(s);
- qemu_mutex_unlock(&region.lock);
+ tcg_region_initial_alloc(s);
}
#endif /* !CONFIG_USER_ONLY */
--
2.25.1
* [PATCH v2 06/29] tcg: Split out tcg_region_prologue_set
2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
` (4 preceding siblings ...)
2021-03-14 21:27 ` [PATCH v2 05/29] tcg: Split out tcg_region_initial_alloc Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
2021-03-14 21:27 ` [PATCH v2 07/29] tcg: Split out region.c Richard Henderson
` (24 subsequent siblings)
30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
To: qemu-devel; +Cc: r.bolshakov, j
This has only one user, but will make more sense after some
code motion.
Always leave the tcg_init_ctx initialized to the first region,
in preparation for tcg_prologue_init(). This also requires
that we don't re-allocate the region for the first cpu, lest
we hit the assertion on the total number of regions allocated.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
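Note for readers: the constraint in the second paragraph above can be
illustrated with a toy model. All names here (toy_pool, toy_ctx,
toy_region_init, toy_register_thread) are hypothetical and only
approximate what tcg_region_init and tcg_register_thread do. The point
is that the template context already owns region 0, so the first thread,
which starts as a copy of that template, must not draw another region.

```c
#include <assert.h>
#include <stddef.h>

struct toy_pool {
    size_t current;  /* next unassigned region */
    size_t n;        /* total regions */
};

struct toy_ctx {
    size_t region;   /* index of the region this context writes into */
};

static void toy_initial_alloc(struct toy_pool *p, struct toy_ctx *c)
{
    assert(p->current < p->n);   /* running out here is a programming error */
    c->region = p->current++;
}

/* Init time: the template context takes region 0, which holds the prologue. */
static void toy_region_init(struct toy_pool *p, struct toy_ctx *init_ctx,
                            size_t n)
{
    p->current = 0;
    p->n = n;
    toy_initial_alloc(p, init_ctx);
}

/*
 * Thread registration: every context starts as a copy of the template.
 * The first thread (index 0) keeps the inherited region 0; only later
 * threads draw a fresh region.
 */
static void toy_register_thread(struct toy_pool *p,
                                const struct toy_ctx *init_ctx,
                                struct toy_ctx *c, unsigned thread_index)
{
    *c = *init_ctx;
    if (thread_index > 0) {
        toy_initial_alloc(p, c);
    }
}
```

If the first thread allocated unconditionally, two regions and two
threads would need three allocations and the assertion would fire; with
the index check, two suffice.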
tcg/tcg.c | 37 ++++++++++++++++++++++---------------
1 file changed, 22 insertions(+), 15 deletions(-)
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 3316a22bde..5b3525d52a 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -880,10 +880,26 @@ void tcg_region_init(void)
tcg_region_trees_init();
- /* In user-mode we support only one ctx, so do the initial allocation now */
-#ifdef CONFIG_USER_ONLY
- tcg_region_initial_alloc__locked(tcg_ctx);
-#endif
+ /*
+ * Leave the initial context initialized to the first region.
+ * This will be the context into which we generate the prologue.
+ * It is also the only context for CONFIG_USER_ONLY.
+ */
+ tcg_region_initial_alloc__locked(&tcg_init_ctx);
+}
+
+static void tcg_region_prologue_set(TCGContext *s)
+{
+ /* Deduct the prologue from the first region. */
+ g_assert(region.start == s->code_gen_buffer);
+ region.start = s->code_ptr;
+
+ /* Recompute boundaries of the first region. */
+ tcg_region_assign(s, 0);
+
+ /* Register the balance of the buffer with gdb. */
+ tcg_register_jit(tcg_splitwx_to_rx(region.start),
+ region.end - region.start);
}
#ifdef CONFIG_DEBUG_TCG
@@ -963,10 +979,10 @@ void tcg_register_thread(void)
if (n > 0) {
alloc_tcg_plugin_context(s);
+ tcg_region_initial_alloc(s);
}
tcg_ctx = s;
- tcg_region_initial_alloc(s);
}
#endif /* !CONFIG_USER_ONLY */
@@ -1206,8 +1222,6 @@ void tcg_prologue_init(TCGContext *s)
{
size_t prologue_size;
- /* Put the prologue at the beginning of code_gen_buffer. */
- tcg_region_assign(s, 0);
s->code_ptr = s->code_gen_ptr;
s->code_buf = s->code_gen_ptr;
s->data_gen_ptr = NULL;
@@ -1239,14 +1253,7 @@ void tcg_prologue_init(TCGContext *s)
(uintptr_t)s->code_buf, prologue_size);
#endif
- /* Deduct the prologue from the first region. */
- region.start = s->code_ptr;
-
- /* Recompute boundaries of the first region. */
- tcg_region_assign(s, 0);
-
- tcg_register_jit(tcg_splitwx_to_rx(region.start),
- region.end - region.start);
+ tcg_region_prologue_set(s);
#ifdef DEBUG_DISAS
if (qemu_loglevel_mask(CPU_LOG_TB_OUT_ASM)) {
--
2.25.1
* [PATCH v2 07/29] tcg: Split out region.c
2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
` (5 preceding siblings ...)
2021-03-14 21:27 ` [PATCH v2 06/29] tcg: Split out tcg_region_prologue_set Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
2021-03-14 21:27 ` [PATCH v2 08/29] accel/tcg: Inline cpu_gen_init Richard Henderson
` (23 subsequent siblings)
30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
To: qemu-devel; +Cc: r.bolshakov, j
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
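Note for readers: the region-count heuristic that moves into
tcg/region.c (tcg_n_regions in the diff below) can be tried in isolation
with the following sketch. It is a paraphrase under assumptions, not the
QEMU function: toy_n_regions takes the buffer size, max_cpus and the
MTTCG setting as plain parameters instead of reading them from
tcg_init_ctx, the machine state and qemu_tcg_mttcg_enabled().

```c
#include <assert.h>
#include <stddef.h>

/* Each region should be at least 2 MB, as in the loop in the diff. */
#define TOY_MIN_REGION_SIZE (2u * 1024 * 1024)

static size_t toy_n_regions(size_t buffer_size, unsigned max_cpus, int mttcg)
{
    size_t per_thread;

    assert(max_cpus > 0);

    /* Use a single region if there is only one vCPU thread. */
    if (max_cpus == 1 || !mttcg) {
        return 1;
    }

    /* Try 8 regions per vCPU first, then fewer, while regions stay >= 2 MB. */
    for (per_thread = 8; per_thread > 0; per_thread--) {
        size_t region_size = buffer_size / ((size_t)max_cpus * per_thread);

        if (region_size >= TOY_MIN_REGION_SIZE) {
            return (size_t)max_cpus * per_thread;
        }
    }

    /* Otherwise fall back to one region per vCPU thread. */
    return max_cpus;
}
```

For example, a 16 MB buffer with 4 vCPUs cannot fit 3 regions per vCPU
at 2 MB each (12 regions of about 1.33 MB), but it can fit 2 per vCPU,
so the sketch returns 8.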
tcg/internal.h | 37 ++++
tcg/region.c | 570 ++++++++++++++++++++++++++++++++++++++++++++++++
tcg/tcg.c | 545 +--------------------------------------------
tcg/meson.build | 1 +
4 files changed, 611 insertions(+), 542 deletions(-)
create mode 100644 tcg/internal.h
create mode 100644 tcg/region.c
diff --git a/tcg/internal.h b/tcg/internal.h
new file mode 100644
index 0000000000..b1dda343c2
--- /dev/null
+++ b/tcg/internal.h
@@ -0,0 +1,37 @@
+/*
+ * Internal declarations for Tiny Code Generator for QEMU
+ *
+ * Copyright (c) 2008 Fabrice Bellard
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#ifndef TCG_INTERNAL_H
+#define TCG_INTERNAL_H 1
+
+#define TCG_HIGHWATER 1024
+
+extern TCGContext **tcg_ctxs;
+extern unsigned int n_tcg_ctxs;
+
+bool tcg_region_alloc(TCGContext *s);
+void tcg_region_initial_alloc(TCGContext *s);
+void tcg_region_prologue_set(TCGContext *s);
+
+#endif /* TCG_INTERNAL_H */
diff --git a/tcg/region.c b/tcg/region.c
new file mode 100644
index 0000000000..af45a0174e
--- /dev/null
+++ b/tcg/region.c
@@ -0,0 +1,570 @@
+/*
+ * Memory region management for Tiny Code Generator for QEMU
+ *
+ * Copyright (c) 2008 Fabrice Bellard
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "qemu/osdep.h"
+#include "exec/exec-all.h"
+#include "tcg/tcg.h"
+#if !defined(CONFIG_USER_ONLY)
+#include "hw/boards.h"
+#endif
+#include "internal.h"
+
+
+struct tcg_region_tree {
+ QemuMutex lock;
+ GTree *tree;
+ /* padding to avoid false sharing is computed at run-time */
+};
+
+/*
+ * We divide code_gen_buffer into equally-sized "regions" that TCG threads
+ * dynamically allocate from as demand dictates. Given appropriate region
+ * sizing, this minimizes flushes even when some TCG threads generate a lot
+ * more code than others.
+ */
+struct tcg_region_state {
+ QemuMutex lock;
+
+ /* fields set at init time */
+ void *start;
+ void *start_aligned;
+ void *end;
+ size_t n;
+ size_t size; /* size of one region */
+ size_t stride; /* .size + guard size */
+
+ /* fields protected by the lock */
+ size_t current; /* current region index */
+ size_t agg_size_full; /* aggregate size of full regions */
+};
+
+static struct tcg_region_state region;
+
+/*
+ * This is an array of struct tcg_region_tree's, with padding.
+ * We use void * to simplify the computation of region_trees[i]; each
+ * struct is found every tree_size bytes.
+ */
+static void *region_trees;
+static size_t tree_size;
+
+/* compare a pointer @ptr and a tb_tc @s */
+static int ptr_cmp_tb_tc(const void *ptr, const struct tb_tc *s)
+{
+ if (ptr >= s->ptr + s->size) {
+ return 1;
+ } else if (ptr < s->ptr) {
+ return -1;
+ }
+ return 0;
+}
+
+static gint tb_tc_cmp(gconstpointer ap, gconstpointer bp)
+{
+ const struct tb_tc *a = ap;
+ const struct tb_tc *b = bp;
+
+ /*
+ * When both sizes are set, we know this isn't a lookup.
+ * This is the most likely case: every TB must be inserted; lookups
+ * are a lot less frequent.
+ */
+ if (likely(a->size && b->size)) {
+ if (a->ptr > b->ptr) {
+ return 1;
+ } else if (a->ptr < b->ptr) {
+ return -1;
+ }
+ /* a->ptr == b->ptr should happen only on deletions */
+ g_assert(a->size == b->size);
+ return 0;
+ }
+ /*
+ * All lookups have either .size field set to 0.
+ * From the glib sources we see that @ap is always the lookup key. However
+ * the docs provide no guarantee, so we just mark this case as likely.
+ */
+ if (likely(a->size == 0)) {
+ return ptr_cmp_tb_tc(a->ptr, b);
+ }
+ return ptr_cmp_tb_tc(b->ptr, a);
+}
+
+static void tcg_region_trees_init(void)
+{
+ size_t i;
+
+ tree_size = ROUND_UP(sizeof(struct tcg_region_tree), qemu_dcache_linesize);
+ region_trees = qemu_memalign(qemu_dcache_linesize, region.n * tree_size);
+ for (i = 0; i < region.n; i++) {
+ struct tcg_region_tree *rt = region_trees + i * tree_size;
+
+ qemu_mutex_init(&rt->lock);
+ rt->tree = g_tree_new(tb_tc_cmp);
+ }
+}
+
+static struct tcg_region_tree *tc_ptr_to_region_tree(const void *p)
+{
+ size_t region_idx;
+
+ /*
+ * Like tcg_splitwx_to_rw, with no assert. The pc may come from
+ * a signal handler over which the caller has no control.
+ */
+ if (!in_code_gen_buffer(p)) {
+ p -= tcg_splitwx_diff;
+ if (!in_code_gen_buffer(p)) {
+ return NULL;
+ }
+ }
+
+ if (p < region.start_aligned) {
+ region_idx = 0;
+ } else {
+ ptrdiff_t offset = p - region.start_aligned;
+
+ if (offset > region.stride * (region.n - 1)) {
+ region_idx = region.n - 1;
+ } else {
+ region_idx = offset / region.stride;
+ }
+ }
+ return region_trees + region_idx * tree_size;
+}
+
+void tcg_tb_insert(TranslationBlock *tb)
+{
+ struct tcg_region_tree *rt = tc_ptr_to_region_tree(tb->tc.ptr);
+
+ g_assert(rt != NULL);
+ qemu_mutex_lock(&rt->lock);
+ g_tree_insert(rt->tree, &tb->tc, tb);
+ qemu_mutex_unlock(&rt->lock);
+}
+
+void tcg_tb_remove(TranslationBlock *tb)
+{
+ struct tcg_region_tree *rt = tc_ptr_to_region_tree(tb->tc.ptr);
+
+ g_assert(rt != NULL);
+ qemu_mutex_lock(&rt->lock);
+ g_tree_remove(rt->tree, &tb->tc);
+ qemu_mutex_unlock(&rt->lock);
+}
+
+/*
+ * Find the TB 'tb' such that
+ * tb->tc.ptr <= tc_ptr < tb->tc.ptr + tb->tc.size
+ * Return NULL if not found.
+ */
+TranslationBlock *tcg_tb_lookup(uintptr_t tc_ptr)
+{
+ struct tcg_region_tree *rt = tc_ptr_to_region_tree((void *)tc_ptr);
+ TranslationBlock *tb;
+ struct tb_tc s = { .ptr = (void *)tc_ptr };
+
+ if (rt == NULL) {
+ return NULL;
+ }
+
+ qemu_mutex_lock(&rt->lock);
+ tb = g_tree_lookup(rt->tree, &s);
+ qemu_mutex_unlock(&rt->lock);
+ return tb;
+}
+
+static void tcg_region_tree_lock_all(void)
+{
+ size_t i;
+
+ for (i = 0; i < region.n; i++) {
+ struct tcg_region_tree *rt = region_trees + i * tree_size;
+
+ qemu_mutex_lock(&rt->lock);
+ }
+}
+
+static void tcg_region_tree_unlock_all(void)
+{
+ size_t i;
+
+ for (i = 0; i < region.n; i++) {
+ struct tcg_region_tree *rt = region_trees + i * tree_size;
+
+ qemu_mutex_unlock(&rt->lock);
+ }
+}
+
+void tcg_tb_foreach(GTraverseFunc func, gpointer user_data)
+{
+ size_t i;
+
+ tcg_region_tree_lock_all();
+ for (i = 0; i < region.n; i++) {
+ struct tcg_region_tree *rt = region_trees + i * tree_size;
+
+ g_tree_foreach(rt->tree, func, user_data);
+ }
+ tcg_region_tree_unlock_all();
+}
+
+size_t tcg_nb_tbs(void)
+{
+ size_t nb_tbs = 0;
+ size_t i;
+
+ tcg_region_tree_lock_all();
+ for (i = 0; i < region.n; i++) {
+ struct tcg_region_tree *rt = region_trees + i * tree_size;
+
+ nb_tbs += g_tree_nnodes(rt->tree);
+ }
+ tcg_region_tree_unlock_all();
+ return nb_tbs;
+}
+
+static gboolean tcg_region_tree_traverse(gpointer k, gpointer v, gpointer data)
+{
+ TranslationBlock *tb = v;
+
+ tb_destroy(tb);
+ return FALSE;
+}
+
+static void tcg_region_tree_reset_all(void)
+{
+ size_t i;
+
+ tcg_region_tree_lock_all();
+ for (i = 0; i < region.n; i++) {
+ struct tcg_region_tree *rt = region_trees + i * tree_size;
+
+ g_tree_foreach(rt->tree, tcg_region_tree_traverse, NULL);
+ /* Increment the refcount first so that destroy acts as a reset */
+ g_tree_ref(rt->tree);
+ g_tree_destroy(rt->tree);
+ }
+ tcg_region_tree_unlock_all();
+}
+
+static void tcg_region_bounds(size_t curr_region, void **pstart, void **pend)
+{
+ void *start, *end;
+
+ start = region.start_aligned + curr_region * region.stride;
+ end = start + region.size;
+
+ if (curr_region == 0) {
+ start = region.start;
+ }
+ if (curr_region == region.n - 1) {
+ end = region.end;
+ }
+
+ *pstart = start;
+ *pend = end;
+}
+
+static void tcg_region_assign(TCGContext *s, size_t curr_region)
+{
+ void *start, *end;
+
+ tcg_region_bounds(curr_region, &start, &end);
+
+ s->code_gen_buffer = start;
+ s->code_gen_ptr = start;
+ s->code_gen_buffer_size = end - start;
+ s->code_gen_highwater = end - TCG_HIGHWATER;
+}
+
+static bool tcg_region_alloc__locked(TCGContext *s)
+{
+ if (region.current == region.n) {
+ return true;
+ }
+ tcg_region_assign(s, region.current);
+ region.current++;
+ return false;
+}
+
+/*
+ * Request a new region once the one in use has filled up.
+ * Returns true on error.
+ */
+bool tcg_region_alloc(TCGContext *s)
+{
+ bool err;
+ /* read the region size now; alloc__locked will overwrite it on success */
+ size_t size_full = s->code_gen_buffer_size;
+
+ qemu_mutex_lock(&region.lock);
+ err = tcg_region_alloc__locked(s);
+ if (!err) {
+ region.agg_size_full += size_full - TCG_HIGHWATER;
+ }
+ qemu_mutex_unlock(&region.lock);
+ return err;
+}
+
+/*
+ * Perform a context's first region allocation.
+ * This function does _not_ increment region.agg_size_full.
+ */
+static void tcg_region_initial_alloc__locked(TCGContext *s)
+{
+ bool err = tcg_region_alloc__locked(s);
+ g_assert(!err);
+}
+
+void tcg_region_initial_alloc(TCGContext *s)
+{
+ qemu_mutex_lock(&region.lock);
+ tcg_region_initial_alloc__locked(s);
+ qemu_mutex_unlock(&region.lock);
+}
+
+/* Call from a safe-work context */
+void tcg_region_reset_all(void)
+{
+ unsigned int n_ctxs = qatomic_read(&n_tcg_ctxs);
+ unsigned int i;
+
+ qemu_mutex_lock(&region.lock);
+ region.current = 0;
+ region.agg_size_full = 0;
+
+ for (i = 0; i < n_ctxs; i++) {
+ TCGContext *s = qatomic_read(&tcg_ctxs[i]);
+ tcg_region_initial_alloc__locked(s);
+ }
+ qemu_mutex_unlock(&region.lock);
+
+ tcg_region_tree_reset_all();
+}
+
+#ifdef CONFIG_USER_ONLY
+static size_t tcg_n_regions(void)
+{
+ return 1;
+}
+#else
+/*
+ * It is likely that some vCPUs will translate more code than others, so we
+ * first try to set more regions than max_cpus, with those regions being of
+ * reasonable size. If that's not possible we make do by evenly dividing
+ * the code_gen_buffer among the vCPUs.
+ */
+static size_t tcg_n_regions(void)
+{
+ size_t i;
+
+ /* Use a single region if all we have is one vCPU thread */
+#if !defined(CONFIG_USER_ONLY)
+ MachineState *ms = MACHINE(qdev_get_machine());
+ unsigned int max_cpus = ms->smp.max_cpus;
+#endif
+ if (max_cpus == 1 || !qemu_tcg_mttcg_enabled()) {
+ return 1;
+ }
+
+ /* Try to have more regions than max_cpus, with each region being >= 2 MB */
+ for (i = 8; i > 0; i--) {
+ size_t regions_per_thread = i;
+ size_t region_size;
+
+ region_size = tcg_init_ctx.code_gen_buffer_size;
+ region_size /= max_cpus * regions_per_thread;
+
+ if (region_size >= 2 * 1024u * 1024) {
+ return max_cpus * regions_per_thread;
+ }
+ }
+ /* If we can't, then just allocate one region per vCPU thread */
+ return max_cpus;
+}
+#endif
+
+/*
+ * Initializes region partitioning.
+ *
+ * Called at init time from the parent thread (i.e. the one calling
+ * tcg_context_init), after the target's TCG globals have been set.
+ *
+ * Region partitioning works by splitting code_gen_buffer into separate regions,
+ * and then assigning regions to TCG threads so that the threads can translate
+ * code in parallel without synchronization.
+ *
+ * In softmmu the number of TCG threads is bounded by max_cpus, so we use at
+ * least max_cpus regions in MTTCG. In !MTTCG we use a single region.
+ * Note that the TCG options from the command-line (i.e. -accel accel=tcg,[...])
+ * must have been parsed before calling this function, since it calls
+ * qemu_tcg_mttcg_enabled().
+ *
+ * In user-mode we use a single region. Having multiple regions in user-mode
+ * is not supported, because the number of vCPU threads (recall that each thread
+ * spawned by the guest corresponds to a vCPU thread) is only bounded by the
+ * OS, and usually this number is huge (tens of thousands is not uncommon).
+ * Thus, given this large bound on the number of vCPU threads and the fact
+ * that code_gen_buffer is allocated at compile-time, we cannot guarantee
+ * that the availability of at least one region per vCPU thread.
+ *
+ * However, this user-mode limitation is unlikely to be a significant problem
+ * in practice. Multi-threaded guests share most if not all of their translated
+ * code, which makes parallel code generation less appealing than in softmmu.
+ */
+void tcg_region_init(void)
+{
+ void *buf = tcg_init_ctx.code_gen_buffer;
+ void *aligned;
+ size_t size = tcg_init_ctx.code_gen_buffer_size;
+ size_t page_size = qemu_real_host_page_size;
+ size_t region_size;
+ size_t n_regions;
+ size_t i;
+ uintptr_t splitwx_diff;
+
+ n_regions = tcg_n_regions();
+
+ /* The first region will be 'aligned - buf' bytes larger than the others */
+ aligned = QEMU_ALIGN_PTR_UP(buf, page_size);
+ g_assert(aligned < tcg_init_ctx.code_gen_buffer + size);
+ /*
+ * Make region_size a multiple of page_size, using aligned as the start.
+ * As a result of this we might end up with a few extra pages at the end of
+ * the buffer; we will assign those to the last region.
+ */
+ region_size = (size - (aligned - buf)) / n_regions;
+ region_size = QEMU_ALIGN_DOWN(region_size, page_size);
+
+ /* A region must have at least 2 pages; one code, one guard */
+ g_assert(region_size >= 2 * page_size);
+
+ /* init the region struct */
+ qemu_mutex_init(&region.lock);
+ region.n = n_regions;
+ region.size = region_size - page_size;
+ region.stride = region_size;
+ region.start = buf;
+ region.start_aligned = aligned;
+ /* page-align the end, since its last page will be a guard page */
+ region.end = QEMU_ALIGN_PTR_DOWN(buf + size, page_size);
+ /* account for that last guard page */
+ region.end -= page_size;
+
+ /* set guard pages */
+ splitwx_diff = tcg_splitwx_diff;
+ for (i = 0; i < region.n; i++) {
+ void *start, *end;
+ int rc;
+
+ tcg_region_bounds(i, &start, &end);
+ rc = qemu_mprotect_none(end, page_size);
+ g_assert(!rc);
+ if (splitwx_diff) {
+ rc = qemu_mprotect_none(end + splitwx_diff, page_size);
+ g_assert(!rc);
+ }
+ }
+
+ tcg_region_trees_init();
+
+ /*
+ * Leave the initial context initialized to the first region.
+ * This will be the context into which we generate the prologue.
+ * It is also the only context for CONFIG_USER_ONLY.
+ */
+ tcg_region_initial_alloc__locked(&tcg_init_ctx);
+}
+
+void tcg_region_prologue_set(TCGContext *s)
+{
+ /* Deduct the prologue from the first region. */
+ g_assert(region.start == s->code_gen_buffer);
+ region.start = s->code_ptr;
+
+ /* Recompute boundaries of the first region. */
+ tcg_region_assign(s, 0);
+
+ /* Register the balance of the buffer with gdb. */
+ tcg_register_jit(tcg_splitwx_to_rx(region.start),
+ region.end - region.start);
+}
+
+/*
+ * Returns the size (in bytes) of all translated code (i.e. from all regions)
+ * currently in the cache.
+ * See also: tcg_code_capacity()
+ * Do not confuse with tcg_current_code_size(); that one applies to a single
+ * TCG context.
+ */
+size_t tcg_code_size(void)
+{
+ unsigned int n_ctxs = qatomic_read(&n_tcg_ctxs);
+ unsigned int i;
+ size_t total;
+
+ qemu_mutex_lock(&region.lock);
+ total = region.agg_size_full;
+ for (i = 0; i < n_ctxs; i++) {
+ const TCGContext *s = qatomic_read(&tcg_ctxs[i]);
+ size_t size;
+
+ size = qatomic_read(&s->code_gen_ptr) - s->code_gen_buffer;
+ g_assert(size <= s->code_gen_buffer_size);
+ total += size;
+ }
+ qemu_mutex_unlock(&region.lock);
+ return total;
+}
+
+/*
+ * Returns the code capacity (in bytes) of the entire cache, i.e. including all
+ * regions.
+ * See also: tcg_code_size()
+ */
+size_t tcg_code_capacity(void)
+{
+ size_t guard_size, capacity;
+
+ /* no need for synchronization; these variables are set at init time */
+ guard_size = region.stride - region.size;
+ capacity = region.end + guard_size - region.start;
+ capacity -= region.n * (guard_size + TCG_HIGHWATER);
+ return capacity;
+}
+
+size_t tcg_tb_phys_invalidate_count(void)
+{
+ unsigned int n_ctxs = qatomic_read(&n_tcg_ctxs);
+ unsigned int i;
+ size_t total = 0;
+
+ for (i = 0; i < n_ctxs; i++) {
+ const TCGContext *s = qatomic_read(&tcg_ctxs[i]);
+
+ total += qatomic_read(&s->tb_phys_invalidate_count);
+ }
+ return total;
+}
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 5b3525d52a..10a571d41c 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -65,6 +65,7 @@
#include "elf.h"
#include "exec/log.h"
#include "sysemu/sysemu.h"
+#include "internal.h"
/* Forward declarations for functions declared in tcg-target.c.inc and
used here. */
@@ -153,10 +154,8 @@ static int tcg_target_const_match(tcg_target_long val, TCGType type,
static int tcg_out_ldst_finalize(TCGContext *s);
#endif
-#define TCG_HIGHWATER 1024
-
-static TCGContext **tcg_ctxs;
-static unsigned int n_tcg_ctxs;
+TCGContext **tcg_ctxs;
+unsigned int n_tcg_ctxs;
TCGv_env cpu_env = 0;
const void *tcg_code_gen_epilogue;
uintptr_t tcg_splitwx_diff;
@@ -165,42 +164,6 @@ uintptr_t tcg_splitwx_diff;
tcg_prologue_fn *tcg_qemu_tb_exec;
#endif
-struct tcg_region_tree {
- QemuMutex lock;
- GTree *tree;
- /* padding to avoid false sharing is computed at run-time */
-};
-
-/*
- * We divide code_gen_buffer into equally-sized "regions" that TCG threads
- * dynamically allocate from as demand dictates. Given appropriate region
- * sizing, this minimizes flushes even when some TCG threads generate a lot
- * more code than others.
- */
-struct tcg_region_state {
- QemuMutex lock;
-
- /* fields set at init time */
- void *start;
- void *start_aligned;
- void *end;
- size_t n;
- size_t size; /* size of one region */
- size_t stride; /* .size + guard size */
-
- /* fields protected by the lock */
- size_t current; /* current region index */
- size_t agg_size_full; /* aggregate size of full regions */
-};
-
-static struct tcg_region_state region;
-/*
- * This is an array of struct tcg_region_tree's, with padding.
- * We use void * to simplify the computation of region_trees[i]; each
- * struct is found every tree_size bytes.
- */
-static void *region_trees;
-static size_t tree_size;
static TCGRegSet tcg_target_available_regs[TCG_TYPE_COUNT];
static TCGRegSet tcg_target_call_clobber_regs;
@@ -457,451 +420,6 @@ static const TCGTargetOpDef constraint_sets[] = {
#include "tcg-target.c.inc"
-/* compare a pointer @ptr and a tb_tc @s */
-static int ptr_cmp_tb_tc(const void *ptr, const struct tb_tc *s)
-{
- if (ptr >= s->ptr + s->size) {
- return 1;
- } else if (ptr < s->ptr) {
- return -1;
- }
- return 0;
-}
-
-static gint tb_tc_cmp(gconstpointer ap, gconstpointer bp)
-{
- const struct tb_tc *a = ap;
- const struct tb_tc *b = bp;
-
- /*
- * When both sizes are set, we know this isn't a lookup.
- * This is the most likely case: every TB must be inserted; lookups
- * are a lot less frequent.
- */
- if (likely(a->size && b->size)) {
- if (a->ptr > b->ptr) {
- return 1;
- } else if (a->ptr < b->ptr) {
- return -1;
- }
- /* a->ptr == b->ptr should happen only on deletions */
- g_assert(a->size == b->size);
- return 0;
- }
- /*
- * All lookups have either .size field set to 0.
- * From the glib sources we see that @ap is always the lookup key. However
- * the docs provide no guarantee, so we just mark this case as likely.
- */
- if (likely(a->size == 0)) {
- return ptr_cmp_tb_tc(a->ptr, b);
- }
- return ptr_cmp_tb_tc(b->ptr, a);
-}
-
-static void tcg_region_trees_init(void)
-{
- size_t i;
-
- tree_size = ROUND_UP(sizeof(struct tcg_region_tree), qemu_dcache_linesize);
- region_trees = qemu_memalign(qemu_dcache_linesize, region.n * tree_size);
- for (i = 0; i < region.n; i++) {
- struct tcg_region_tree *rt = region_trees + i * tree_size;
-
- qemu_mutex_init(&rt->lock);
- rt->tree = g_tree_new(tb_tc_cmp);
- }
-}
-
-static struct tcg_region_tree *tc_ptr_to_region_tree(const void *p)
-{
- size_t region_idx;
-
- /*
- * Like tcg_splitwx_to_rw, with no assert. The pc may come from
- * a signal handler over which the caller has no control.
- */
- if (!in_code_gen_buffer(p)) {
- p -= tcg_splitwx_diff;
- if (!in_code_gen_buffer(p)) {
- return NULL;
- }
- }
-
- if (p < region.start_aligned) {
- region_idx = 0;
- } else {
- ptrdiff_t offset = p - region.start_aligned;
-
- if (offset > region.stride * (region.n - 1)) {
- region_idx = region.n - 1;
- } else {
- region_idx = offset / region.stride;
- }
- }
- return region_trees + region_idx * tree_size;
-}
-
-void tcg_tb_insert(TranslationBlock *tb)
-{
- struct tcg_region_tree *rt = tc_ptr_to_region_tree(tb->tc.ptr);
-
- g_assert(rt != NULL);
- qemu_mutex_lock(&rt->lock);
- g_tree_insert(rt->tree, &tb->tc, tb);
- qemu_mutex_unlock(&rt->lock);
-}
-
-void tcg_tb_remove(TranslationBlock *tb)
-{
- struct tcg_region_tree *rt = tc_ptr_to_region_tree(tb->tc.ptr);
-
- g_assert(rt != NULL);
- qemu_mutex_lock(&rt->lock);
- g_tree_remove(rt->tree, &tb->tc);
- qemu_mutex_unlock(&rt->lock);
-}
-
-/*
- * Find the TB 'tb' such that
- * tb->tc.ptr <= tc_ptr < tb->tc.ptr + tb->tc.size
- * Return NULL if not found.
- */
-TranslationBlock *tcg_tb_lookup(uintptr_t tc_ptr)
-{
- struct tcg_region_tree *rt = tc_ptr_to_region_tree((void *)tc_ptr);
- TranslationBlock *tb;
- struct tb_tc s = { .ptr = (void *)tc_ptr };
-
- if (rt == NULL) {
- return NULL;
- }
-
- qemu_mutex_lock(&rt->lock);
- tb = g_tree_lookup(rt->tree, &s);
- qemu_mutex_unlock(&rt->lock);
- return tb;
-}
-
-static void tcg_region_tree_lock_all(void)
-{
- size_t i;
-
- for (i = 0; i < region.n; i++) {
- struct tcg_region_tree *rt = region_trees + i * tree_size;
-
- qemu_mutex_lock(&rt->lock);
- }
-}
-
-static void tcg_region_tree_unlock_all(void)
-{
- size_t i;
-
- for (i = 0; i < region.n; i++) {
- struct tcg_region_tree *rt = region_trees + i * tree_size;
-
- qemu_mutex_unlock(&rt->lock);
- }
-}
-
-void tcg_tb_foreach(GTraverseFunc func, gpointer user_data)
-{
- size_t i;
-
- tcg_region_tree_lock_all();
- for (i = 0; i < region.n; i++) {
- struct tcg_region_tree *rt = region_trees + i * tree_size;
-
- g_tree_foreach(rt->tree, func, user_data);
- }
- tcg_region_tree_unlock_all();
-}
-
-size_t tcg_nb_tbs(void)
-{
- size_t nb_tbs = 0;
- size_t i;
-
- tcg_region_tree_lock_all();
- for (i = 0; i < region.n; i++) {
- struct tcg_region_tree *rt = region_trees + i * tree_size;
-
- nb_tbs += g_tree_nnodes(rt->tree);
- }
- tcg_region_tree_unlock_all();
- return nb_tbs;
-}
-
-static gboolean tcg_region_tree_traverse(gpointer k, gpointer v, gpointer data)
-{
- TranslationBlock *tb = v;
-
- tb_destroy(tb);
- return FALSE;
-}
-
-static void tcg_region_tree_reset_all(void)
-{
- size_t i;
-
- tcg_region_tree_lock_all();
- for (i = 0; i < region.n; i++) {
- struct tcg_region_tree *rt = region_trees + i * tree_size;
-
- g_tree_foreach(rt->tree, tcg_region_tree_traverse, NULL);
- /* Increment the refcount first so that destroy acts as a reset */
- g_tree_ref(rt->tree);
- g_tree_destroy(rt->tree);
- }
- tcg_region_tree_unlock_all();
-}
-
-static void tcg_region_bounds(size_t curr_region, void **pstart, void **pend)
-{
- void *start, *end;
-
- start = region.start_aligned + curr_region * region.stride;
- end = start + region.size;
-
- if (curr_region == 0) {
- start = region.start;
- }
- if (curr_region == region.n - 1) {
- end = region.end;
- }
-
- *pstart = start;
- *pend = end;
-}
-
-static void tcg_region_assign(TCGContext *s, size_t curr_region)
-{
- void *start, *end;
-
- tcg_region_bounds(curr_region, &start, &end);
-
- s->code_gen_buffer = start;
- s->code_gen_ptr = start;
- s->code_gen_buffer_size = end - start;
- s->code_gen_highwater = end - TCG_HIGHWATER;
-}
-
-static bool tcg_region_alloc__locked(TCGContext *s)
-{
- if (region.current == region.n) {
- return true;
- }
- tcg_region_assign(s, region.current);
- region.current++;
- return false;
-}
-
-/*
- * Request a new region once the one in use has filled up.
- * Returns true on error.
- */
-static bool tcg_region_alloc(TCGContext *s)
-{
- bool err;
- /* read the region size now; alloc__locked will overwrite it on success */
- size_t size_full = s->code_gen_buffer_size;
-
- qemu_mutex_lock(&region.lock);
- err = tcg_region_alloc__locked(s);
- if (!err) {
- region.agg_size_full += size_full - TCG_HIGHWATER;
- }
- qemu_mutex_unlock(&region.lock);
- return err;
-}
-
-/*
- * Perform a context's first region allocation.
- * This function does _not_ increment region.agg_size_full.
- */
-static void tcg_region_initial_alloc__locked(TCGContext *s)
-{
- bool err = tcg_region_alloc__locked(s);
- g_assert(!err);
-}
-
-#ifndef CONFIG_USER_ONLY
-static void tcg_region_initial_alloc(TCGContext *s)
-{
- qemu_mutex_lock(&region.lock);
- tcg_region_initial_alloc__locked(s);
- qemu_mutex_unlock(&region.lock);
-}
-#endif
-
-/* Call from a safe-work context */
-void tcg_region_reset_all(void)
-{
- unsigned int n_ctxs = qatomic_read(&n_tcg_ctxs);
- unsigned int i;
-
- qemu_mutex_lock(&region.lock);
- region.current = 0;
- region.agg_size_full = 0;
-
- for (i = 0; i < n_ctxs; i++) {
- TCGContext *s = qatomic_read(&tcg_ctxs[i]);
- tcg_region_initial_alloc__locked(s);
- }
- qemu_mutex_unlock(&region.lock);
-
- tcg_region_tree_reset_all();
-}
-
-#ifdef CONFIG_USER_ONLY
-static size_t tcg_n_regions(void)
-{
- return 1;
-}
-#else
-/*
- * It is likely that some vCPUs will translate more code than others, so we
- * first try to set more regions than max_cpus, with those regions being of
- * reasonable size. If that's not possible we make do by evenly dividing
- * the code_gen_buffer among the vCPUs.
- */
-static size_t tcg_n_regions(void)
-{
- size_t i;
-
- /* Use a single region if all we have is one vCPU thread */
-#if !defined(CONFIG_USER_ONLY)
- MachineState *ms = MACHINE(qdev_get_machine());
- unsigned int max_cpus = ms->smp.max_cpus;
-#endif
- if (max_cpus == 1 || !qemu_tcg_mttcg_enabled()) {
- return 1;
- }
-
- /* Try to have more regions than max_cpus, with each region being >= 2 MB */
- for (i = 8; i > 0; i--) {
- size_t regions_per_thread = i;
- size_t region_size;
-
- region_size = tcg_init_ctx.code_gen_buffer_size;
- region_size /= max_cpus * regions_per_thread;
-
- if (region_size >= 2 * 1024u * 1024) {
- return max_cpus * regions_per_thread;
- }
- }
- /* If we can't, then just allocate one region per vCPU thread */
- return max_cpus;
-}
-#endif
-
-/*
- * Initializes region partitioning.
- *
- * Called at init time from the parent thread (i.e. the one calling
- * tcg_context_init), after the target's TCG globals have been set.
- *
- * Region partitioning works by splitting code_gen_buffer into separate regions,
- * and then assigning regions to TCG threads so that the threads can translate
- * code in parallel without synchronization.
- *
- * In softmmu the number of TCG threads is bounded by max_cpus, so we use at
- * least max_cpus regions in MTTCG. In !MTTCG we use a single region.
- * Note that the TCG options from the command-line (i.e. -accel accel=tcg,[...])
- * must have been parsed before calling this function, since it calls
- * qemu_tcg_mttcg_enabled().
- *
- * In user-mode we use a single region. Having multiple regions in user-mode
- * is not supported, because the number of vCPU threads (recall that each thread
- * spawned by the guest corresponds to a vCPU thread) is only bounded by the
- * OS, and usually this number is huge (tens of thousands is not uncommon).
- * Thus, given this large bound on the number of vCPU threads and the fact
- * that code_gen_buffer is allocated at compile-time, we cannot guarantee
- * that the availability of at least one region per vCPU thread.
- *
- * However, this user-mode limitation is unlikely to be a significant problem
- * in practice. Multi-threaded guests share most if not all of their translated
- * code, which makes parallel code generation less appealing than in softmmu.
- */
-void tcg_region_init(void)
-{
- void *buf = tcg_init_ctx.code_gen_buffer;
- void *aligned;
- size_t size = tcg_init_ctx.code_gen_buffer_size;
- size_t page_size = qemu_real_host_page_size;
- size_t region_size;
- size_t n_regions;
- size_t i;
- uintptr_t splitwx_diff;
-
- n_regions = tcg_n_regions();
-
- /* The first region will be 'aligned - buf' bytes larger than the others */
- aligned = QEMU_ALIGN_PTR_UP(buf, page_size);
- g_assert(aligned < tcg_init_ctx.code_gen_buffer + size);
- /*
- * Make region_size a multiple of page_size, using aligned as the start.
- * As a result of this we might end up with a few extra pages at the end of
- * the buffer; we will assign those to the last region.
- */
- region_size = (size - (aligned - buf)) / n_regions;
- region_size = QEMU_ALIGN_DOWN(region_size, page_size);
-
- /* A region must have at least 2 pages; one code, one guard */
- g_assert(region_size >= 2 * page_size);
-
- /* init the region struct */
- qemu_mutex_init(&region.lock);
- region.n = n_regions;
- region.size = region_size - page_size;
- region.stride = region_size;
- region.start = buf;
- region.start_aligned = aligned;
- /* page-align the end, since its last page will be a guard page */
- region.end = QEMU_ALIGN_PTR_DOWN(buf + size, page_size);
- /* account for that last guard page */
- region.end -= page_size;
-
- /* set guard pages */
- splitwx_diff = tcg_splitwx_diff;
- for (i = 0; i < region.n; i++) {
- void *start, *end;
- int rc;
-
- tcg_region_bounds(i, &start, &end);
- rc = qemu_mprotect_none(end, page_size);
- g_assert(!rc);
- if (splitwx_diff) {
- rc = qemu_mprotect_none(end + splitwx_diff, page_size);
- g_assert(!rc);
- }
- }
-
- tcg_region_trees_init();
-
- /*
- * Leave the initial context initialized to the first region.
- * This will be the context into which we generate the prologue.
- * It is also the only context for CONFIG_USER_ONLY.
- */
- tcg_region_initial_alloc__locked(&tcg_init_ctx);
-}
-
-static void tcg_region_prologue_set(TCGContext *s)
-{
- /* Deduct the prologue from the first region. */
- g_assert(region.start == s->code_gen_buffer);
- region.start = s->code_ptr;
-
- /* Recompute boundaries of the first region. */
- tcg_region_assign(s, 0);
-
- /* Register the balance of the buffer with gdb. */
- tcg_register_jit(tcg_splitwx_to_rx(region.start),
- region.end - region.start);
-}
-
#ifdef CONFIG_DEBUG_TCG
const void *tcg_splitwx_to_rx(void *rw)
{
@@ -986,63 +504,6 @@ void tcg_register_thread(void)
}
#endif /* !CONFIG_USER_ONLY */
-/*
- * Returns the size (in bytes) of all translated code (i.e. from all regions)
- * currently in the cache.
- * See also: tcg_code_capacity()
- * Do not confuse with tcg_current_code_size(); that one applies to a single
- * TCG context.
- */
-size_t tcg_code_size(void)
-{
- unsigned int n_ctxs = qatomic_read(&n_tcg_ctxs);
- unsigned int i;
- size_t total;
-
- qemu_mutex_lock(&region.lock);
- total = region.agg_size_full;
- for (i = 0; i < n_ctxs; i++) {
- const TCGContext *s = qatomic_read(&tcg_ctxs[i]);
- size_t size;
-
- size = qatomic_read(&s->code_gen_ptr) - s->code_gen_buffer;
- g_assert(size <= s->code_gen_buffer_size);
- total += size;
- }
- qemu_mutex_unlock(&region.lock);
- return total;
-}
-
-/*
- * Returns the code capacity (in bytes) of the entire cache, i.e. including all
- * regions.
- * See also: tcg_code_size()
- */
-size_t tcg_code_capacity(void)
-{
- size_t guard_size, capacity;
-
- /* no need for synchronization; these variables are set at init time */
- guard_size = region.stride - region.size;
- capacity = region.end + guard_size - region.start;
- capacity -= region.n * (guard_size + TCG_HIGHWATER);
- return capacity;
-}
-
-size_t tcg_tb_phys_invalidate_count(void)
-{
- unsigned int n_ctxs = qatomic_read(&n_tcg_ctxs);
- unsigned int i;
- size_t total = 0;
-
- for (i = 0; i < n_ctxs; i++) {
- const TCGContext *s = qatomic_read(&tcg_ctxs[i]);
-
- total += qatomic_read(&s->tb_phys_invalidate_count);
- }
- return total;
-}
-
/* pool based memory allocation */
void *tcg_malloc_internal(TCGContext *s, int size)
{
diff --git a/tcg/meson.build b/tcg/meson.build
index 84064a341e..5be3915529 100644
--- a/tcg/meson.build
+++ b/tcg/meson.build
@@ -2,6 +2,7 @@ tcg_ss = ss.source_set()
tcg_ss.add(files(
'optimize.c',
+ 'region.c',
'tcg.c',
'tcg-common.c',
'tcg-op.c',
--
2.25.1
^ permalink raw reply related [flat|nested] 38+ messages in thread
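For readers following the region code moved by patch 07, the arithmetic in tcg_n_regions() and tcg_region_bounds() can be tried in isolation. What follows is only a rough sketch, not the QEMU implementation: the names sketch_n_regions, sketch_region and sketch_region_bounds are hypothetical, the buffer is modelled with byte offsets instead of pointers, and max_cpus and the MTTCG flag are passed in as plain arguments rather than read from the machine state.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-alone version of the region-count heuristic:
 * try 8, 7, ..., 1 regions per vCPU and keep the first count that
 * still leaves every region at least 2 MiB large.
 */
static size_t sketch_n_regions(size_t buffer_size, unsigned max_cpus,
                               int mttcg)
{
    size_t i;

    assert(max_cpus > 0);
    /* A single vCPU thread, or no MTTCG, means a single region. */
    if (max_cpus == 1 || !mttcg) {
        return 1;
    }
    for (i = 8; i > 0; i--) {
        size_t region_size = buffer_size / (max_cpus * i);

        if (region_size >= 2 * 1024u * 1024) {
            return max_cpus * i;
        }
    }
    /* Buffer too small for 2 MiB regions: one region per vCPU. */
    return max_cpus;
}

/* Hypothetical mirror of the init-time fields, as byte offsets. */
struct sketch_region {
    size_t start;          /* first usable byte */
    size_t start_aligned;  /* start rounded up to a page */
    size_t end;            /* end of the last region, before its guard */
    size_t n;              /* number of regions */
    size_t size;           /* usable bytes per region */
    size_t stride;         /* size plus one guard page */
};

/*
 * Same shape as tcg_region_bounds(): regions are laid out every
 * 'stride' bytes from start_aligned; the first region also takes the
 * unaligned head and the last one takes any leftover pages at the end.
 */
static void sketch_region_bounds(const struct sketch_region *r, size_t idx,
                                 size_t *pstart, size_t *pend)
{
    size_t start = r->start_aligned + idx * r->stride;
    size_t end = start + r->size;

    if (idx == 0) {
        start = r->start;
    }
    if (idx == r->n - 1) {
        end = r->end;
    }
    *pstart = start;
    *pend = end;
}
```

With a 16 MiB buffer and 4 vCPUs under MTTCG, for example, the loop settles on 2 regions per vCPU, since 16 MiB / 8 is exactly 2 MiB. A 4 MiB buffer cannot satisfy the 2 MiB minimum at all and falls back to one region per vCPU.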
* [PATCH v2 08/29] accel/tcg: Inline cpu_gen_init
2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
` (6 preceding siblings ...)
2021-03-14 21:27 ` [PATCH v2 07/29] tcg: Split out region.c Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
2021-03-14 21:27 ` [PATCH v2 09/29] accel/tcg: Move alloc_code_gen_buffer to tcg/region.c Richard Henderson
` (22 subsequent siblings)
30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
To: qemu-devel; +Cc: r.bolshakov, j, Philippe Mathieu-Daudé
It consists of one function call and has only one caller.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
accel/tcg/translate-all.c | 7 +------
1 file changed, 1 insertion(+), 6 deletions(-)
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index b9057567f4..6d3184e7da 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -245,11 +245,6 @@ static void page_table_config_init(void)
assert(v_l2_levels >= 0);
}
-static void cpu_gen_init(void)
-{
- tcg_context_init(&tcg_init_ctx);
-}
-
/* Encode VAL as a signed leb128 sequence at P.
Return P incremented past the encoded value. */
static uint8_t *encode_sleb128(uint8_t *p, target_long val)
@@ -1331,7 +1326,7 @@ void tcg_exec_init(unsigned long tb_size, int splitwx)
bool ok;
tcg_allowed = true;
- cpu_gen_init();
+ tcg_context_init(&tcg_init_ctx);
page_init();
tb_htable_init();
--
2.25.1
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH v2 09/29] accel/tcg: Move alloc_code_gen_buffer to tcg/region.c
2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
` (7 preceding siblings ...)
2021-03-14 21:27 ` [PATCH v2 08/29] accel/tcg: Inline cpu_gen_init Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
2021-03-14 21:27 ` [PATCH v2 10/29] accel/tcg: Rename tcg_init to tcg_init_machine Richard Henderson
` (21 subsequent siblings)
30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
To: qemu-devel; +Cc: r.bolshakov, j
Buffer management is integral to tcg. Do not leave the allocation
to code outside of tcg/. This is code movement, with further
cleanups to follow.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
include/tcg/tcg.h | 2 +-
accel/tcg/translate-all.c | 414 +------------------------------------
tcg/region.c | 421 +++++++++++++++++++++++++++++++++++++-
3 files changed, 418 insertions(+), 419 deletions(-)
diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index 0f0695e90d..7a435bf807 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -874,7 +874,7 @@ void *tcg_malloc_internal(TCGContext *s, int size);
void tcg_pool_reset(TCGContext *s);
TranslationBlock *tcg_tb_alloc(TCGContext *s);
-void tcg_region_init(void);
+void tcg_region_init(size_t tb_size, int splitwx);
void tb_destroy(TranslationBlock *tb);
void tcg_region_reset_all(void);
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 6d3184e7da..4071edda16 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -18,7 +18,6 @@
*/
#include "qemu/osdep.h"
-#include "qemu/units.h"
#include "qemu-common.h"
#define NO_CPU_IO_DEFS
@@ -51,7 +50,6 @@
#include "exec/tb-hash.h"
#include "exec/translate-all.h"
#include "qemu/bitmap.h"
-#include "qemu/error-report.h"
#include "qemu/qemu-print.h"
#include "qemu/timer.h"
#include "qemu/main-loop.h"
@@ -895,408 +893,6 @@ static void page_lock_pair(PageDesc **ret_p1, tb_page_addr_t phys1,
}
}
-/* Minimum size of the code gen buffer. This number is randomly chosen,
- but not so small that we can't have a fair number of TB's live. */
-#define MIN_CODE_GEN_BUFFER_SIZE (1 * MiB)
-
-/* Maximum size of the code gen buffer we'd like to use. Unless otherwise
- indicated, this is constrained by the range of direct branches on the
- host cpu, as used by the TCG implementation of goto_tb. */
-#if defined(__x86_64__)
-# define MAX_CODE_GEN_BUFFER_SIZE (2 * GiB)
-#elif defined(__sparc__)
-# define MAX_CODE_GEN_BUFFER_SIZE (2 * GiB)
-#elif defined(__powerpc64__)
-# define MAX_CODE_GEN_BUFFER_SIZE (2 * GiB)
-#elif defined(__powerpc__)
-# define MAX_CODE_GEN_BUFFER_SIZE (32 * MiB)
-#elif defined(__aarch64__)
-# define MAX_CODE_GEN_BUFFER_SIZE (2 * GiB)
-#elif defined(__s390x__)
- /* We have a +- 4GB range on the branches; leave some slop. */
-# define MAX_CODE_GEN_BUFFER_SIZE (3 * GiB)
-#elif defined(__mips__)
- /* We have a 256MB branch region, but leave room to make sure the
- main executable is also within that region. */
-# define MAX_CODE_GEN_BUFFER_SIZE (128 * MiB)
-#else
-# define MAX_CODE_GEN_BUFFER_SIZE ((size_t)-1)
-#endif
-
-#if TCG_TARGET_REG_BITS == 32
-#define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (32 * MiB)
-#ifdef CONFIG_USER_ONLY
-/*
- * For user mode on smaller 32 bit systems we may run into trouble
- * allocating big chunks of data in the right place. On these systems
- * we utilise a static code generation buffer directly in the binary.
- */
-#define USE_STATIC_CODE_GEN_BUFFER
-#endif
-#else /* TCG_TARGET_REG_BITS == 64 */
-#ifdef CONFIG_USER_ONLY
-/*
- * As user-mode emulation typically means running multiple instances
- * of the translator don't go too nuts with our default code gen
- * buffer lest we make things too hard for the OS.
- */
-#define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (128 * MiB)
-#else
-/*
- * We expect most system emulation to run one or two guests per host.
- * Users running large scale system emulation may want to tweak their
- * runtime setup via the tb-size control on the command line.
- */
-#define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (1 * GiB)
-#endif
-#endif
-
-#define DEFAULT_CODE_GEN_BUFFER_SIZE \
- (DEFAULT_CODE_GEN_BUFFER_SIZE_1 < MAX_CODE_GEN_BUFFER_SIZE \
- ? DEFAULT_CODE_GEN_BUFFER_SIZE_1 : MAX_CODE_GEN_BUFFER_SIZE)
-
-static size_t size_code_gen_buffer(size_t tb_size)
-{
- /* Size the buffer. */
- if (tb_size == 0) {
- size_t phys_mem = qemu_get_host_physmem();
- if (phys_mem == 0) {
- tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE;
- } else {
- tb_size = MIN(DEFAULT_CODE_GEN_BUFFER_SIZE, phys_mem / 8);
- }
- }
- if (tb_size < MIN_CODE_GEN_BUFFER_SIZE) {
- tb_size = MIN_CODE_GEN_BUFFER_SIZE;
- }
- if (tb_size > MAX_CODE_GEN_BUFFER_SIZE) {
- tb_size = MAX_CODE_GEN_BUFFER_SIZE;
- }
- return tb_size;
-}
-
-#ifdef __mips__
-/* In order to use J and JAL within the code_gen_buffer, we require
- that the buffer not cross a 256MB boundary. */
-static inline bool cross_256mb(void *addr, size_t size)
-{
- return ((uintptr_t)addr ^ ((uintptr_t)addr + size)) & ~0x0ffffffful;
-}
-
-/* We weren't able to allocate a buffer without crossing that boundary,
- so make do with the larger portion of the buffer that doesn't cross.
- Returns the new base of the buffer, and adjusts code_gen_buffer_size. */
-static inline void *split_cross_256mb(void *buf1, size_t size1)
-{
- void *buf2 = (void *)(((uintptr_t)buf1 + size1) & ~0x0ffffffful);
- size_t size2 = buf1 + size1 - buf2;
-
- size1 = buf2 - buf1;
- if (size1 < size2) {
- size1 = size2;
- buf1 = buf2;
- }
-
- tcg_ctx->code_gen_buffer_size = size1;
- return buf1;
-}
-#endif
-
-#ifdef USE_STATIC_CODE_GEN_BUFFER
-static uint8_t static_code_gen_buffer[DEFAULT_CODE_GEN_BUFFER_SIZE]
- __attribute__((aligned(CODE_GEN_ALIGN)));
-
-static bool alloc_code_gen_buffer(size_t tb_size, int splitwx, Error **errp)
-{
- void *buf, *end;
- size_t size;
-
- if (splitwx > 0) {
- error_setg(errp, "jit split-wx not supported");
- return false;
- }
-
- /* page-align the beginning and end of the buffer */
- buf = static_code_gen_buffer;
- end = static_code_gen_buffer + sizeof(static_code_gen_buffer);
- buf = QEMU_ALIGN_PTR_UP(buf, qemu_real_host_page_size);
- end = QEMU_ALIGN_PTR_DOWN(end, qemu_real_host_page_size);
-
- size = end - buf;
-
- /* Honor a command-line option limiting the size of the buffer. */
- if (size > tb_size) {
- size = QEMU_ALIGN_DOWN(tb_size, qemu_real_host_page_size);
- }
- tcg_ctx->code_gen_buffer_size = size;
-
-#ifdef __mips__
- if (cross_256mb(buf, size)) {
- buf = split_cross_256mb(buf, size);
- size = tcg_ctx->code_gen_buffer_size;
- }
-#endif
-
- if (qemu_mprotect_rwx(buf, size)) {
- error_setg_errno(errp, errno, "mprotect of jit buffer");
- return false;
- }
- qemu_madvise(buf, size, QEMU_MADV_HUGEPAGE);
-
- tcg_ctx->code_gen_buffer = buf;
- return true;
-}
-#elif defined(_WIN32)
-static bool alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
-{
- void *buf;
-
- if (splitwx > 0) {
- error_setg(errp, "jit split-wx not supported");
- return false;
- }
-
- buf = VirtualAlloc(NULL, size, MEM_RESERVE | MEM_COMMIT,
- PAGE_EXECUTE_READWRITE);
- if (buf == NULL) {
- error_setg_win32(errp, GetLastError(),
- "allocate %zu bytes for jit buffer", size);
- return false;
- }
-
- tcg_ctx->code_gen_buffer = buf;
- tcg_ctx->code_gen_buffer_size = size;
- return true;
-}
-#else
-static bool alloc_code_gen_buffer_anon(size_t size, int prot,
- int flags, Error **errp)
-{
- void *buf;
-
- buf = mmap(NULL, size, prot, flags, -1, 0);
- if (buf == MAP_FAILED) {
- error_setg_errno(errp, errno,
- "allocate %zu bytes for jit buffer", size);
- return false;
- }
- tcg_ctx->code_gen_buffer_size = size;
-
-#ifdef __mips__
- if (cross_256mb(buf, size)) {
- /*
- * Try again, with the original still mapped, to avoid re-acquiring
- * the same 256mb crossing.
- */
- size_t size2;
- void *buf2 = mmap(NULL, size, prot, flags, -1, 0);
- switch ((int)(buf2 != MAP_FAILED)) {
- case 1:
- if (!cross_256mb(buf2, size)) {
- /* Success! Use the new buffer. */
- munmap(buf, size);
- break;
- }
- /* Failure. Work with what we had. */
- munmap(buf2, size);
- /* fallthru */
- default:
- /* Split the original buffer. Free the smaller half. */
- buf2 = split_cross_256mb(buf, size);
- size2 = tcg_ctx->code_gen_buffer_size;
- if (buf == buf2) {
- munmap(buf + size2, size - size2);
- } else {
- munmap(buf, size - size2);
- }
- size = size2;
- break;
- }
- buf = buf2;
- }
-#endif
-
- /* Request large pages for the buffer. */
- qemu_madvise(buf, size, QEMU_MADV_HUGEPAGE);
-
- tcg_ctx->code_gen_buffer = buf;
- return true;
-}
-
-#ifndef CONFIG_TCG_INTERPRETER
-#ifdef CONFIG_POSIX
-#include "qemu/memfd.h"
-
-static bool alloc_code_gen_buffer_splitwx_memfd(size_t size, Error **errp)
-{
- void *buf_rw = NULL, *buf_rx = MAP_FAILED;
- int fd = -1;
-
-#ifdef __mips__
- /* Find space for the RX mapping, vs the 256MiB regions. */
- if (!alloc_code_gen_buffer_anon(size, PROT_NONE,
- MAP_PRIVATE | MAP_ANONYMOUS |
- MAP_NORESERVE, errp)) {
- return false;
- }
- /* The size of the mapping may have been adjusted. */
- size = tcg_ctx->code_gen_buffer_size;
- buf_rx = tcg_ctx->code_gen_buffer;
-#endif
-
- buf_rw = qemu_memfd_alloc("tcg-jit", size, 0, &fd, errp);
- if (buf_rw == NULL) {
- goto fail;
- }
-
-#ifdef __mips__
- void *tmp = mmap(buf_rx, size, PROT_READ | PROT_EXEC,
- MAP_SHARED | MAP_FIXED, fd, 0);
- if (tmp != buf_rx) {
- goto fail_rx;
- }
-#else
- buf_rx = mmap(NULL, size, PROT_READ | PROT_EXEC, MAP_SHARED, fd, 0);
- if (buf_rx == MAP_FAILED) {
- goto fail_rx;
- }
-#endif
-
- close(fd);
- tcg_ctx->code_gen_buffer = buf_rw;
- tcg_ctx->code_gen_buffer_size = size;
- tcg_splitwx_diff = buf_rx - buf_rw;
-
- /* Request large pages for the buffer and the splitwx. */
- qemu_madvise(buf_rw, size, QEMU_MADV_HUGEPAGE);
- qemu_madvise(buf_rx, size, QEMU_MADV_HUGEPAGE);
- return true;
-
- fail_rx:
- error_setg_errno(errp, errno, "failed to map shared memory for execute");
- fail:
- if (buf_rx != MAP_FAILED) {
- munmap(buf_rx, size);
- }
- if (buf_rw) {
- munmap(buf_rw, size);
- }
- if (fd >= 0) {
- close(fd);
- }
- return false;
-}
-#endif /* CONFIG_POSIX */
-
-#ifdef CONFIG_DARWIN
-#include <mach/mach.h>
-
-extern kern_return_t mach_vm_remap(vm_map_t target_task,
- mach_vm_address_t *target_address,
- mach_vm_size_t size,
- mach_vm_offset_t mask,
- int flags,
- vm_map_t src_task,
- mach_vm_address_t src_address,
- boolean_t copy,
- vm_prot_t *cur_protection,
- vm_prot_t *max_protection,
- vm_inherit_t inheritance);
-
-static bool alloc_code_gen_buffer_splitwx_vmremap(size_t size, Error **errp)
-{
- kern_return_t ret;
- mach_vm_address_t buf_rw, buf_rx;
- vm_prot_t cur_prot, max_prot;
-
- /* Map the read-write portion via normal anon memory. */
- if (!alloc_code_gen_buffer_anon(size, PROT_READ | PROT_WRITE,
- MAP_PRIVATE | MAP_ANONYMOUS, errp)) {
- return false;
- }
-
- buf_rw = (mach_vm_address_t)tcg_ctx->code_gen_buffer;
- buf_rx = 0;
- ret = mach_vm_remap(mach_task_self(),
- &buf_rx,
- size,
- 0,
- VM_FLAGS_ANYWHERE,
- mach_task_self(),
- buf_rw,
- false,
- &cur_prot,
- &max_prot,
- VM_INHERIT_NONE);
- if (ret != KERN_SUCCESS) {
- /* TODO: Convert "ret" to a human readable error message. */
- error_setg(errp, "vm_remap for jit splitwx failed");
- munmap((void *)buf_rw, size);
- return false;
- }
-
- if (mprotect((void *)buf_rx, size, PROT_READ | PROT_EXEC) != 0) {
- error_setg_errno(errp, errno, "mprotect for jit splitwx");
- munmap((void *)buf_rx, size);
- munmap((void *)buf_rw, size);
- return false;
- }
-
- tcg_splitwx_diff = buf_rx - buf_rw;
- return true;
-}
-#endif /* CONFIG_DARWIN */
-#endif /* CONFIG_TCG_INTERPRETER */
-
-static bool alloc_code_gen_buffer_splitwx(size_t size, Error **errp)
-{
-#ifndef CONFIG_TCG_INTERPRETER
-# ifdef CONFIG_DARWIN
- return alloc_code_gen_buffer_splitwx_vmremap(size, errp);
-# endif
-# ifdef CONFIG_POSIX
- return alloc_code_gen_buffer_splitwx_memfd(size, errp);
-# endif
-#endif
- error_setg(errp, "jit split-wx not supported");
- return false;
-}
-
-static bool alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
-{
- ERRP_GUARD();
- int prot, flags;
-
- if (splitwx) {
- if (alloc_code_gen_buffer_splitwx(size, errp)) {
- return true;
- }
- /*
- * If splitwx force-on (1), fail;
- * if splitwx default-on (-1), fall through to splitwx off.
- */
- if (splitwx > 0) {
- return false;
- }
- error_free_or_abort(errp);
- }
-
- prot = PROT_READ | PROT_WRITE | PROT_EXEC;
- flags = MAP_PRIVATE | MAP_ANONYMOUS;
-#ifdef CONFIG_TCG_INTERPRETER
- /* The tcg interpreter does not need execute permission. */
- prot = PROT_READ | PROT_WRITE;
-#elif defined(CONFIG_DARWIN)
- /* Applicable to both iOS and macOS (Apple Silicon). */
- if (!splitwx) {
- flags |= MAP_JIT;
- }
-#endif
-
- return alloc_code_gen_buffer_anon(size, prot, flags, errp);
-}
-#endif /* USE_STATIC_CODE_GEN_BUFFER, WIN32, POSIX */
-
static bool tb_cmp(const void *ap, const void *bp)
{
const TranslationBlock *a = ap;
@@ -1323,19 +919,11 @@ static void tb_htable_init(void)
size. */
void tcg_exec_init(unsigned long tb_size, int splitwx)
{
- bool ok;
-
tcg_allowed = true;
tcg_context_init(&tcg_init_ctx);
page_init();
tb_htable_init();
-
- ok = alloc_code_gen_buffer(size_code_gen_buffer(tb_size),
- splitwx, &error_fatal);
- assert(ok);
-
- /* TODO: allocating regions is hand-in-glove with code_gen_buffer. */
- tcg_region_init();
+ tcg_region_init(tb_size, splitwx);
#if defined(CONFIG_SOFTMMU)
/* There's no guest base to take into account, so go ahead and
diff --git a/tcg/region.c b/tcg/region.c
index af45a0174e..8d88144a22 100644
--- a/tcg/region.c
+++ b/tcg/region.c
@@ -23,6 +23,8 @@
*/
#include "qemu/osdep.h"
+#include "qemu/units.h"
+#include "qapi/error.h"
#include "exec/exec-all.h"
#include "tcg/tcg.h"
#if !defined(CONFIG_USER_ONLY)
@@ -406,6 +408,408 @@ static size_t tcg_n_regions(void)
}
#endif
+/* Minimum size of the code gen buffer. This number is randomly chosen,
+ but not so small that we can't have a fair number of TB's live. */
+#define MIN_CODE_GEN_BUFFER_SIZE (1 * MiB)
+
+/* Maximum size of the code gen buffer we'd like to use. Unless otherwise
+ indicated, this is constrained by the range of direct branches on the
+ host cpu, as used by the TCG implementation of goto_tb. */
+#if defined(__x86_64__)
+# define MAX_CODE_GEN_BUFFER_SIZE (2 * GiB)
+#elif defined(__sparc__)
+# define MAX_CODE_GEN_BUFFER_SIZE (2 * GiB)
+#elif defined(__powerpc64__)
+# define MAX_CODE_GEN_BUFFER_SIZE (2 * GiB)
+#elif defined(__powerpc__)
+# define MAX_CODE_GEN_BUFFER_SIZE (32 * MiB)
+#elif defined(__aarch64__)
+# define MAX_CODE_GEN_BUFFER_SIZE (2 * GiB)
+#elif defined(__s390x__)
+ /* We have a +- 4GB range on the branches; leave some slop. */
+# define MAX_CODE_GEN_BUFFER_SIZE (3 * GiB)
+#elif defined(__mips__)
+ /* We have a 256MB branch region, but leave room to make sure the
+ main executable is also within that region. */
+# define MAX_CODE_GEN_BUFFER_SIZE (128 * MiB)
+#else
+# define MAX_CODE_GEN_BUFFER_SIZE ((size_t)-1)
+#endif
+
+#if TCG_TARGET_REG_BITS == 32
+#define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (32 * MiB)
+#ifdef CONFIG_USER_ONLY
+/*
+ * For user mode on smaller 32 bit systems we may run into trouble
+ * allocating big chunks of data in the right place. On these systems
+ * we utilise a static code generation buffer directly in the binary.
+ */
+#define USE_STATIC_CODE_GEN_BUFFER
+#endif
+#else /* TCG_TARGET_REG_BITS == 64 */
+#ifdef CONFIG_USER_ONLY
+/*
+ * As user-mode emulation typically means running multiple instances
+ * of the translator don't go too nuts with our default code gen
+ * buffer lest we make things too hard for the OS.
+ */
+#define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (128 * MiB)
+#else
+/*
+ * We expect most system emulation to run one or two guests per host.
+ * Users running large scale system emulation may want to tweak their
+ * runtime setup via the tb-size control on the command line.
+ */
+#define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (1 * GiB)
+#endif
+#endif
+
+#define DEFAULT_CODE_GEN_BUFFER_SIZE \
+ (DEFAULT_CODE_GEN_BUFFER_SIZE_1 < MAX_CODE_GEN_BUFFER_SIZE \
+ ? DEFAULT_CODE_GEN_BUFFER_SIZE_1 : MAX_CODE_GEN_BUFFER_SIZE)
+
+static size_t size_code_gen_buffer(size_t tb_size)
+{
+ /* Size the buffer. */
+ if (tb_size == 0) {
+ size_t phys_mem = qemu_get_host_physmem();
+ if (phys_mem == 0) {
+ tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE;
+ } else {
+ tb_size = MIN(DEFAULT_CODE_GEN_BUFFER_SIZE, phys_mem / 8);
+ }
+ }
+ if (tb_size < MIN_CODE_GEN_BUFFER_SIZE) {
+ tb_size = MIN_CODE_GEN_BUFFER_SIZE;
+ }
+ if (tb_size > MAX_CODE_GEN_BUFFER_SIZE) {
+ tb_size = MAX_CODE_GEN_BUFFER_SIZE;
+ }
+ return tb_size;
+}
+
+#ifdef __mips__
+/* In order to use J and JAL within the code_gen_buffer, we require
+ that the buffer not cross a 256MB boundary. */
+static inline bool cross_256mb(void *addr, size_t size)
+{
+ return ((uintptr_t)addr ^ ((uintptr_t)addr + size)) & ~0x0ffffffful;
+}
+
+/* We weren't able to allocate a buffer without crossing that boundary,
+ so make do with the larger portion of the buffer that doesn't cross.
+ Returns the new base of the buffer, and adjusts code_gen_buffer_size. */
+static inline void *split_cross_256mb(void *buf1, size_t size1)
+{
+ void *buf2 = (void *)(((uintptr_t)buf1 + size1) & ~0x0ffffffful);
+ size_t size2 = buf1 + size1 - buf2;
+
+ size1 = buf2 - buf1;
+ if (size1 < size2) {
+ size1 = size2;
+ buf1 = buf2;
+ }
+
+ tcg_ctx->code_gen_buffer_size = size1;
+ return buf1;
+}
+#endif
+
+#ifdef USE_STATIC_CODE_GEN_BUFFER
+static uint8_t static_code_gen_buffer[DEFAULT_CODE_GEN_BUFFER_SIZE]
+ __attribute__((aligned(CODE_GEN_ALIGN)));
+
+static bool alloc_code_gen_buffer(size_t tb_size, int splitwx, Error **errp)
+{
+ void *buf, *end;
+ size_t size;
+
+ if (splitwx > 0) {
+ error_setg(errp, "jit split-wx not supported");
+ return false;
+ }
+
+ /* page-align the beginning and end of the buffer */
+ buf = static_code_gen_buffer;
+ end = static_code_gen_buffer + sizeof(static_code_gen_buffer);
+ buf = QEMU_ALIGN_PTR_UP(buf, qemu_real_host_page_size);
+ end = QEMU_ALIGN_PTR_DOWN(end, qemu_real_host_page_size);
+
+ size = end - buf;
+
+ /* Honor a command-line option limiting the size of the buffer. */
+ if (size > tb_size) {
+ size = QEMU_ALIGN_DOWN(tb_size, qemu_real_host_page_size);
+ }
+ tcg_ctx->code_gen_buffer_size = size;
+
+#ifdef __mips__
+ if (cross_256mb(buf, size)) {
+ buf = split_cross_256mb(buf, size);
+ size = tcg_ctx->code_gen_buffer_size;
+ }
+#endif
+
+ if (qemu_mprotect_rwx(buf, size)) {
+ error_setg_errno(errp, errno, "mprotect of jit buffer");
+ return false;
+ }
+ qemu_madvise(buf, size, QEMU_MADV_HUGEPAGE);
+
+ tcg_ctx->code_gen_buffer = buf;
+ return true;
+}
+#elif defined(_WIN32)
+static bool alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
+{
+ void *buf;
+
+ if (splitwx > 0) {
+ error_setg(errp, "jit split-wx not supported");
+ return false;
+ }
+
+ buf = VirtualAlloc(NULL, size, MEM_RESERVE | MEM_COMMIT,
+ PAGE_EXECUTE_READWRITE);
+ if (buf == NULL) {
+ error_setg_win32(errp, GetLastError(),
+ "allocate %zu bytes for jit buffer", size);
+ return false;
+ }
+
+ tcg_ctx->code_gen_buffer = buf;
+ tcg_ctx->code_gen_buffer_size = size;
+ return true;
+}
+#else
+static bool alloc_code_gen_buffer_anon(size_t size, int prot,
+ int flags, Error **errp)
+{
+ void *buf;
+
+ buf = mmap(NULL, size, prot, flags, -1, 0);
+ if (buf == MAP_FAILED) {
+ error_setg_errno(errp, errno,
+ "allocate %zu bytes for jit buffer", size);
+ return false;
+ }
+ tcg_ctx->code_gen_buffer_size = size;
+
+#ifdef __mips__
+ if (cross_256mb(buf, size)) {
+ /*
+ * Try again, with the original still mapped, to avoid re-acquiring
+ * the same 256mb crossing.
+ */
+ size_t size2;
+ void *buf2 = mmap(NULL, size, prot, flags, -1, 0);
+ switch ((int)(buf2 != MAP_FAILED)) {
+ case 1:
+ if (!cross_256mb(buf2, size)) {
+ /* Success! Use the new buffer. */
+ munmap(buf, size);
+ break;
+ }
+ /* Failure. Work with what we had. */
+ munmap(buf2, size);
+ /* fallthru */
+ default:
+ /* Split the original buffer. Free the smaller half. */
+ buf2 = split_cross_256mb(buf, size);
+ size2 = tcg_ctx->code_gen_buffer_size;
+ if (buf == buf2) {
+ munmap(buf + size2, size - size2);
+ } else {
+ munmap(buf, size - size2);
+ }
+ size = size2;
+ break;
+ }
+ buf = buf2;
+ }
+#endif
+
+ /* Request large pages for the buffer. */
+ qemu_madvise(buf, size, QEMU_MADV_HUGEPAGE);
+
+ tcg_ctx->code_gen_buffer = buf;
+ return true;
+}
+
+#ifndef CONFIG_TCG_INTERPRETER
+#ifdef CONFIG_POSIX
+#include "qemu/memfd.h"
+
+static bool alloc_code_gen_buffer_splitwx_memfd(size_t size, Error **errp)
+{
+ void *buf_rw = NULL, *buf_rx = MAP_FAILED;
+ int fd = -1;
+
+#ifdef __mips__
+ /* Find space for the RX mapping, vs the 256MiB regions. */
+ if (!alloc_code_gen_buffer_anon(size, PROT_NONE,
+ MAP_PRIVATE | MAP_ANONYMOUS |
+ MAP_NORESERVE, errp)) {
+ return false;
+ }
+ /* The size of the mapping may have been adjusted. */
+ size = tcg_ctx->code_gen_buffer_size;
+ buf_rx = tcg_ctx->code_gen_buffer;
+#endif
+
+ buf_rw = qemu_memfd_alloc("tcg-jit", size, 0, &fd, errp);
+ if (buf_rw == NULL) {
+ goto fail;
+ }
+
+#ifdef __mips__
+ void *tmp = mmap(buf_rx, size, PROT_READ | PROT_EXEC,
+ MAP_SHARED | MAP_FIXED, fd, 0);
+ if (tmp != buf_rx) {
+ goto fail_rx;
+ }
+#else
+ buf_rx = mmap(NULL, size, PROT_READ | PROT_EXEC, MAP_SHARED, fd, 0);
+ if (buf_rx == MAP_FAILED) {
+ goto fail_rx;
+ }
+#endif
+
+ close(fd);
+ tcg_ctx->code_gen_buffer = buf_rw;
+ tcg_ctx->code_gen_buffer_size = size;
+ tcg_splitwx_diff = buf_rx - buf_rw;
+
+ /* Request large pages for the buffer and the splitwx. */
+ qemu_madvise(buf_rw, size, QEMU_MADV_HUGEPAGE);
+ qemu_madvise(buf_rx, size, QEMU_MADV_HUGEPAGE);
+ return true;
+
+ fail_rx:
+ error_setg_errno(errp, errno, "failed to map shared memory for execute");
+ fail:
+ if (buf_rx != MAP_FAILED) {
+ munmap(buf_rx, size);
+ }
+ if (buf_rw) {
+ munmap(buf_rw, size);
+ }
+ if (fd >= 0) {
+ close(fd);
+ }
+ return false;
+}
+#endif /* CONFIG_POSIX */
+
+#ifdef CONFIG_DARWIN
+#include <mach/mach.h>
+
+extern kern_return_t mach_vm_remap(vm_map_t target_task,
+ mach_vm_address_t *target_address,
+ mach_vm_size_t size,
+ mach_vm_offset_t mask,
+ int flags,
+ vm_map_t src_task,
+ mach_vm_address_t src_address,
+ boolean_t copy,
+ vm_prot_t *cur_protection,
+ vm_prot_t *max_protection,
+ vm_inherit_t inheritance);
+
+static bool alloc_code_gen_buffer_splitwx_vmremap(size_t size, Error **errp)
+{
+ kern_return_t ret;
+ mach_vm_address_t buf_rw, buf_rx;
+ vm_prot_t cur_prot, max_prot;
+
+ /* Map the read-write portion via normal anon memory. */
+ if (!alloc_code_gen_buffer_anon(size, PROT_READ | PROT_WRITE,
+ MAP_PRIVATE | MAP_ANONYMOUS, errp)) {
+ return false;
+ }
+
+ buf_rw = (mach_vm_address_t)tcg_ctx->code_gen_buffer;
+ buf_rx = 0;
+ ret = mach_vm_remap(mach_task_self(),
+ &buf_rx,
+ size,
+ 0,
+ VM_FLAGS_ANYWHERE,
+ mach_task_self(),
+ buf_rw,
+ false,
+ &cur_prot,
+ &max_prot,
+ VM_INHERIT_NONE);
+ if (ret != KERN_SUCCESS) {
+ /* TODO: Convert "ret" to a human readable error message. */
+ error_setg(errp, "vm_remap for jit splitwx failed");
+ munmap((void *)buf_rw, size);
+ return false;
+ }
+
+ if (mprotect((void *)buf_rx, size, PROT_READ | PROT_EXEC) != 0) {
+ error_setg_errno(errp, errno, "mprotect for jit splitwx");
+ munmap((void *)buf_rx, size);
+ munmap((void *)buf_rw, size);
+ return false;
+ }
+
+ tcg_splitwx_diff = buf_rx - buf_rw;
+ return true;
+}
+#endif /* CONFIG_DARWIN */
+#endif /* CONFIG_TCG_INTERPRETER */
+
+static bool alloc_code_gen_buffer_splitwx(size_t size, Error **errp)
+{
+#ifndef CONFIG_TCG_INTERPRETER
+# ifdef CONFIG_DARWIN
+ return alloc_code_gen_buffer_splitwx_vmremap(size, errp);
+# endif
+# ifdef CONFIG_POSIX
+ return alloc_code_gen_buffer_splitwx_memfd(size, errp);
+# endif
+#endif
+ error_setg(errp, "jit split-wx not supported");
+ return false;
+}
+
+static bool alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
+{
+ ERRP_GUARD();
+ int prot, flags;
+
+ if (splitwx) {
+ if (alloc_code_gen_buffer_splitwx(size, errp)) {
+ return true;
+ }
+ /*
+ * If splitwx force-on (1), fail;
+ * if splitwx default-on (-1), fall through to splitwx off.
+ */
+ if (splitwx > 0) {
+ return false;
+ }
+ error_free_or_abort(errp);
+ }
+
+ prot = PROT_READ | PROT_WRITE | PROT_EXEC;
+ flags = MAP_PRIVATE | MAP_ANONYMOUS;
+#ifdef CONFIG_TCG_INTERPRETER
+ /* The tcg interpreter does not need execute permission. */
+ prot = PROT_READ | PROT_WRITE;
+#elif defined(CONFIG_DARWIN)
+ /* Applicable to both iOS and macOS (Apple Silicon). */
+ if (!splitwx) {
+ flags |= MAP_JIT;
+ }
+#endif
+
+ return alloc_code_gen_buffer_anon(size, prot, flags, errp);
+}
+#endif /* USE_STATIC_CODE_GEN_BUFFER, WIN32, POSIX */
+
/*
* Initializes region partitioning.
*
@@ -434,17 +838,24 @@ static size_t tcg_n_regions(void)
* in practice. Multi-threaded guests share most if not all of their translated
* code, which makes parallel code generation less appealing than in softmmu.
*/
-void tcg_region_init(void)
+void tcg_region_init(size_t tb_size, int splitwx)
{
- void *buf = tcg_init_ctx.code_gen_buffer;
- void *aligned;
- size_t size = tcg_init_ctx.code_gen_buffer_size;
- size_t page_size = qemu_real_host_page_size;
+ void *buf, *aligned;
+ size_t size;
+ size_t page_size;
size_t region_size;
size_t n_regions;
size_t i;
uintptr_t splitwx_diff;
+ bool ok;
+ ok = alloc_code_gen_buffer(size_code_gen_buffer(tb_size),
+ splitwx, &error_fatal);
+ assert(ok);
+
+ buf = tcg_init_ctx.code_gen_buffer;
+ size = tcg_init_ctx.code_gen_buffer_size;
+ page_size = qemu_real_host_page_size;
n_regions = tcg_n_regions();
/* The first region will be 'aligned - buf' bytes larger than the others */
--
2.25.1
^ permalink raw reply related [flat|nested] 38+ messages in thread
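A note on the MIPS handling moved by the patch above: cross_256mb and split_cross_256mb exist because J and JAL can only reach targets inside the current 256MB segment. What follows is a minimal standalone sketch of the same arithmetic, not the QEMU code itself. The names crosses_256mb and split_at_256mb are hypothetical, and the sketch assumes the buffer spans at most one boundary (true for anything up to 256MB).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Selects the address bits above the low 28, i.e. the 256MB segment. */
#define SEGMENT_MASK (~(uintptr_t)0x0fffffff)

/* True if addr and addr + size lie in different 256MB segments. */
static bool crosses_256mb(uintptr_t addr, size_t size)
{
    return ((addr ^ (addr + size)) & SEGMENT_MASK) != 0;
}

/*
 * Keep the larger piece on either side of the boundary.
 * Returns the new base and stores the new length in *size.
 */
static uintptr_t split_at_256mb(uintptr_t base, size_t *size)
{
    uintptr_t boundary;
    size_t below, above;

    assert(crosses_256mb(base, *size));

    boundary = (base + *size) & SEGMENT_MASK;
    below = boundary - base;
    above = base + *size - boundary;

    if (below < above) {
        *size = above;
        return boundary;
    }
    *size = below;
    return base;
}
```

Unlike split_cross_256mb, the sketch returns the new length through a pointer rather than writing it to tcg_ctx; that is only to keep the example free of global state.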
* [PATCH v2 10/29] accel/tcg: Rename tcg_init to tcg_init_machine
2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
` (8 preceding siblings ...)
2021-03-14 21:27 ` [PATCH v2 09/29] accel/tcg: Move alloc_code_gen_buffer to tcg/region.c Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
2021-03-14 21:27 ` [PATCH v2 11/29] tcg: Create tcg_init Richard Henderson
` (20 subsequent siblings)
30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
To: qemu-devel; +Cc: r.bolshakov, j, Philippe Mathieu-Daudé
We will shortly want to use the name tcg_init for something else.
Since the AccelClass hook is called init_machine, rename to match.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
accel/tcg/tcg-all.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/accel/tcg/tcg-all.c b/accel/tcg/tcg-all.c
index f132033999..30d81ff7f5 100644
--- a/accel/tcg/tcg-all.c
+++ b/accel/tcg/tcg-all.c
@@ -105,7 +105,7 @@ static void tcg_accel_instance_init(Object *obj)
bool mttcg_enabled;
-static int tcg_init(MachineState *ms)
+static int tcg_init_machine(MachineState *ms)
{
TCGState *s = TCG_STATE(current_accel());
@@ -189,7 +189,7 @@ static void tcg_accel_class_init(ObjectClass *oc, void *data)
{
AccelClass *ac = ACCEL_CLASS(oc);
ac->name = "tcg";
- ac->init_machine = tcg_init;
+ ac->init_machine = tcg_init_machine;
ac->allowed = &tcg_allowed;
object_class_property_add_str(oc, "thread",
--
2.25.1
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH v2 11/29] tcg: Create tcg_init
2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
` (9 preceding siblings ...)
2021-03-14 21:27 ` [PATCH v2 10/29] accel/tcg: Rename tcg_init to tcg_init_machine Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
2021-03-14 21:27 ` [PATCH v2 12/29] accel/tcg: Merge tcg_exec_init into tcg_init_machine Richard Henderson
` (19 subsequent siblings)
30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
To: qemu-devel; +Cc: r.bolshakov, j
Perform both tcg_context_init and tcg_region_init from a single
entry point, tcg_init. Do not leave this sequencing to the caller.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
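The point of the wrapper is that the ordering of the two steps lives in one place. A minimal sketch of that shape, assuming stub functions in place of the real initializers; every name below is hypothetical and the trace buffer exists only to make the order observable.

```c
#include <assert.h>
#include <stddef.h>

/* Illustration only: records which step ran, in order. */
static char init_trace[8];
static size_t init_trace_len;

static void record(char step)
{
    assert(init_trace_len < sizeof(init_trace) - 1);
    init_trace[init_trace_len++] = step;
}

/* Stand-in for the context setup step. */
static void context_init_stub(void)
{
    record('c');
}

/* Stand-in for the region setup step, which must run second. */
static void region_init_stub(size_t tb_size, int splitwx)
{
    (void)tb_size;
    (void)splitwx;
    record('r');
}

/* Single entry point: callers cannot get the order wrong. */
static void sketch_tcg_init(size_t tb_size, int splitwx)
{
    context_init_stub();
    region_init_stub(tb_size, splitwx);
}
```

With this shape the two initializers can become private to their own files, which is what moving the tcg_region_init prototype to tcg/internal.h does below.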
include/tcg/tcg.h | 3 +--
tcg/internal.h | 1 +
accel/tcg/translate-all.c | 3 +--
tcg/tcg.c | 9 ++++++++-
4 files changed, 11 insertions(+), 5 deletions(-)
diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index 7a435bf807..3ad77ec34d 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -874,7 +874,6 @@ void *tcg_malloc_internal(TCGContext *s, int size);
void tcg_pool_reset(TCGContext *s);
TranslationBlock *tcg_tb_alloc(TCGContext *s);
-void tcg_region_init(size_t tb_size, int splitwx);
void tb_destroy(TranslationBlock *tb);
void tcg_region_reset_all(void);
@@ -907,7 +906,7 @@ static inline void *tcg_malloc(int size)
}
}
-void tcg_context_init(TCGContext *s);
+void tcg_init(size_t tb_size, int splitwx);
void tcg_register_thread(void);
void tcg_prologue_init(TCGContext *s);
void tcg_func_start(TCGContext *s);
diff --git a/tcg/internal.h b/tcg/internal.h
index b1dda343c2..f13c564d9b 100644
--- a/tcg/internal.h
+++ b/tcg/internal.h
@@ -30,6 +30,7 @@
extern TCGContext **tcg_ctxs;
extern unsigned int n_tcg_ctxs;
+void tcg_region_init(size_t tb_size, int splitwx);
bool tcg_region_alloc(TCGContext *s);
void tcg_region_initial_alloc(TCGContext *s);
void tcg_region_prologue_set(TCGContext *s);
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 4071edda16..050b4bff46 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -920,10 +920,9 @@ static void tb_htable_init(void)
void tcg_exec_init(unsigned long tb_size, int splitwx)
{
tcg_allowed = true;
- tcg_context_init(&tcg_init_ctx);
page_init();
tb_htable_init();
- tcg_region_init(tb_size, splitwx);
+ tcg_init(tb_size, splitwx);
#if defined(CONFIG_SOFTMMU)
/* There's no guest base to take into account, so go ahead and
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 10a571d41c..65a63bda8a 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -576,8 +576,9 @@ static void process_op_defs(TCGContext *s);
static TCGTemp *tcg_global_reg_new_internal(TCGContext *s, TCGType type,
TCGReg reg, const char *name);
-void tcg_context_init(TCGContext *s)
+static void tcg_context_init(void)
{
+ TCGContext *s = &tcg_init_ctx;
int op, total_args, n, i;
TCGOpDef *def;
TCGArgConstraint *args_ct;
@@ -654,6 +655,12 @@ void tcg_context_init(TCGContext *s)
cpu_env = temp_tcgv_ptr(ts);
}
+void tcg_init(size_t tb_size, int splitwx)
+{
+ tcg_context_init();
+ tcg_region_init(tb_size, splitwx);
+}
+
/*
* Allocate TBs right before their corresponding translated code, making
* sure that TBs and code are on different cache lines.
--
2.25.1
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH v2 12/29] accel/tcg: Merge tcg_exec_init into tcg_init_machine
2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
` (10 preceding siblings ...)
2021-03-14 21:27 ` [PATCH v2 11/29] tcg: Create tcg_init Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
2021-03-14 21:27 ` [PATCH v2 13/29] accel/tcg: Pass down max_cpus to tcg_init Richard Henderson
` (18 subsequent siblings)
30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
To: qemu-devel; +Cc: r.bolshakov, j
There is only one caller of tcg_exec_init, and we will shortly need
access to the MachineState, which tcg_init_machine already has.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
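For reference, tcg_init_machine scales s->tb_size by 1024 * 1024 and passes the result down, with zero still meaning "use the default". A minimal sketch of how such a request is then clamped, modeled on size_code_gen_buffer from patch 09. The function name and the phys_mem parameter are hypothetical (the real code calls qemu_get_host_physmem), and the three limits are the values for 64-bit system emulation on an x86_64 host; other configurations differ.

```c
#include <assert.h>
#include <stdint.h>

#define MIB ((uint64_t)1 << 20)

/* Assumed limits: 64-bit system emulation on an x86_64 host. */
#define SKETCH_MIN_SIZE     (1 * MIB)
#define SKETCH_DEFAULT_SIZE (1024 * MIB)
#define SKETCH_MAX_SIZE     (2048 * MIB)

/*
 * tb_size:  requested size in bytes, 0 for "pick a default".
 * phys_mem: host physical memory in bytes, 0 if unknown.
 */
static uint64_t sketch_size_buffer(uint64_t tb_size, uint64_t phys_mem)
{
    assert(SKETCH_MIN_SIZE <= SKETCH_MAX_SIZE);

    if (tb_size == 0) {
        /* Default: at most one eighth of host memory. */
        tb_size = SKETCH_DEFAULT_SIZE;
        if (phys_mem != 0 && phys_mem / 8 < tb_size) {
            tb_size = phys_mem / 8;
        }
    }
    if (tb_size < SKETCH_MIN_SIZE) {
        tb_size = SKETCH_MIN_SIZE;
    }
    if (tb_size > SKETCH_MAX_SIZE) {
        tb_size = SKETCH_MAX_SIZE;
    }
    return tb_size;
}
```

Note that an explicit request is clamped to the same minimum and maximum as the default, so a tiny or oversized tb-size is adjusted rather than rejected.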
accel/tcg/internal.h | 2 ++
include/sysemu/tcg.h | 2 --
accel/tcg/tcg-all.c | 14 +++++++++++++-
accel/tcg/translate-all.c | 21 ++-------------------
4 files changed, 17 insertions(+), 22 deletions(-)
diff --git a/accel/tcg/internal.h b/accel/tcg/internal.h
index e9c145e0fb..881bc1ede0 100644
--- a/accel/tcg/internal.h
+++ b/accel/tcg/internal.h
@@ -16,5 +16,7 @@ TranslationBlock *tb_gen_code(CPUState *cpu, target_ulong pc,
int cflags);
void QEMU_NORETURN cpu_io_recompile(CPUState *cpu, uintptr_t retaddr);
+void page_init(void);
+void tb_htable_init(void);
#endif /* ACCEL_TCG_INTERNAL_H */
diff --git a/include/sysemu/tcg.h b/include/sysemu/tcg.h
index 00349fb18a..53352450ff 100644
--- a/include/sysemu/tcg.h
+++ b/include/sysemu/tcg.h
@@ -8,8 +8,6 @@
#ifndef SYSEMU_TCG_H
#define SYSEMU_TCG_H
-void tcg_exec_init(unsigned long tb_size, int splitwx);
-
#ifdef CONFIG_TCG
extern bool tcg_allowed;
#define tcg_enabled() (tcg_allowed)
diff --git a/accel/tcg/tcg-all.c b/accel/tcg/tcg-all.c
index 30d81ff7f5..0e83acbfe5 100644
--- a/accel/tcg/tcg-all.c
+++ b/accel/tcg/tcg-all.c
@@ -32,6 +32,7 @@
#include "qemu/error-report.h"
#include "qemu/accel.h"
#include "qapi/qapi-builtin-visit.h"
+#include "internal.h"
struct TCGState {
AccelState parent_obj;
@@ -109,8 +110,19 @@ static int tcg_init_machine(MachineState *ms)
{
TCGState *s = TCG_STATE(current_accel());
- tcg_exec_init(s->tb_size * 1024 * 1024, s->splitwx_enabled);
+ tcg_allowed = true;
mttcg_enabled = s->mttcg_enabled;
+
+ page_init();
+ tb_htable_init();
+ tcg_init(s->tb_size * 1024 * 1024, s->splitwx_enabled);
+
+#if defined(CONFIG_SOFTMMU)
+ /* There's no guest base to take into account, so go ahead and
+ initialize the prologue now. */
+ tcg_prologue_init(tcg_ctx);
+#endif
+
return 0;
}
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 050b4bff46..40aeecf611 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -408,7 +408,7 @@ bool cpu_restore_state(CPUState *cpu, uintptr_t host_pc, bool will_exit)
return false;
}
-static void page_init(void)
+void page_init(void)
{
page_size_init();
page_table_config_init();
@@ -907,30 +907,13 @@ static bool tb_cmp(const void *ap, const void *bp)
a->page_addr[1] == b->page_addr[1];
}
-static void tb_htable_init(void)
+void tb_htable_init(void)
{
unsigned int mode = QHT_MODE_AUTO_RESIZE;
qht_init(&tb_ctx.htable, tb_cmp, CODE_GEN_HTABLE_SIZE, mode);
}
-/* Must be called before using the QEMU cpus. 'tb_size' is the size
- (in bytes) allocated to the translation buffer. Zero means default
- size. */
-void tcg_exec_init(unsigned long tb_size, int splitwx)
-{
- tcg_allowed = true;
- page_init();
- tb_htable_init();
- tcg_init(tb_size, splitwx);
-
-#if defined(CONFIG_SOFTMMU)
- /* There's no guest base to take into account, so go ahead and
- initialize the prologue now. */
- tcg_prologue_init(tcg_ctx);
-#endif
-}
-
/* call with @p->lock held */
static inline void invalidate_page_bitmap(PageDesc *p)
{
--
2.25.1
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH v2 13/29] accel/tcg: Pass down max_cpus to tcg_init
2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
` (11 preceding siblings ...)
2021-03-14 21:27 ` [PATCH v2 12/29] accel/tcg: Merge tcg_exec_init into tcg_init_machine Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
2021-03-14 21:27 ` [PATCH v2 14/29] tcg: Introduce tcg_max_ctxs Richard Henderson
` (17 subsequent siblings)
30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
To: qemu-devel; +Cc: r.bolshakov, j, Philippe Mathieu-Daudé
Start removing the include of hw/boards.h from tcg/.
Pass down the max_cpus value from tcg_init_machine,
where we have the MachineState already.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
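To show what the max_cpus parameter feeds into, here is a minimal sketch of a region-count policy along the lines of the tcg_n_regions comment: prefer several regions per vCPU as long as each stays reasonably large, else fall back to one per vCPU. The per-vCPU multiplier of 8 and the 2MiB floor are assumptions for illustration, not values shown in this patch, and the function name and its explicit buf_size and multi_threaded parameters are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed tuning values, chosen for illustration. */
#define SKETCH_MAX_REGIONS_PER_CPU 8
#define SKETCH_MIN_REGION_SIZE     ((size_t)2 << 20)

static size_t sketch_n_regions(size_t buf_size, unsigned max_cpus,
                               bool multi_threaded)
{
    size_t per_cpu;

    assert(max_cpus > 0);

    /* A single translating thread needs only one region. */
    if (max_cpus == 1 || !multi_threaded) {
        return 1;
    }

    /* Try the most regions per vCPU first, then back off. */
    for (per_cpu = SKETCH_MAX_REGIONS_PER_CPU; per_cpu > 0; per_cpu--) {
        size_t n = (size_t)max_cpus * per_cpu;

        if (buf_size / n >= SKETCH_MIN_REGION_SIZE) {
            return n;
        }
    }

    /* Regions would be too small: one per vCPU, evenly divided. */
    return max_cpus;
}
```

Because every input is a parameter, the policy no longer needs MachineState or qdev_get_machine, which is the dependency this patch starts to remove from tcg/.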
include/tcg/tcg.h | 2 +-
tcg/internal.h | 2 +-
accel/tcg/tcg-all.c | 10 +++++++++-
tcg/region.c | 32 +++++++++++---------------------
tcg/tcg.c | 10 ++++------
5 files changed, 26 insertions(+), 30 deletions(-)
diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index 3ad77ec34d..a0122c0dd3 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -906,7 +906,7 @@ static inline void *tcg_malloc(int size)
}
}
-void tcg_init(size_t tb_size, int splitwx);
+void tcg_init(size_t tb_size, int splitwx, unsigned max_cpus);
void tcg_register_thread(void);
void tcg_prologue_init(TCGContext *s);
void tcg_func_start(TCGContext *s);
diff --git a/tcg/internal.h b/tcg/internal.h
index f13c564d9b..fcfeca232f 100644
--- a/tcg/internal.h
+++ b/tcg/internal.h
@@ -30,7 +30,7 @@
extern TCGContext **tcg_ctxs;
extern unsigned int n_tcg_ctxs;
-void tcg_region_init(size_t tb_size, int splitwx);
+void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus);
bool tcg_region_alloc(TCGContext *s);
void tcg_region_initial_alloc(TCGContext *s);
void tcg_region_prologue_set(TCGContext *s);
diff --git a/accel/tcg/tcg-all.c b/accel/tcg/tcg-all.c
index 0e83acbfe5..d2f2ddb844 100644
--- a/accel/tcg/tcg-all.c
+++ b/accel/tcg/tcg-all.c
@@ -32,6 +32,9 @@
#include "qemu/error-report.h"
#include "qemu/accel.h"
#include "qapi/qapi-builtin-visit.h"
+#if !defined(CONFIG_USER_ONLY)
+#include "hw/boards.h"
+#endif
#include "internal.h"
struct TCGState {
@@ -109,13 +112,18 @@ bool mttcg_enabled;
static int tcg_init_machine(MachineState *ms)
{
TCGState *s = TCG_STATE(current_accel());
+#ifdef CONFIG_USER_ONLY
+ unsigned max_cpus = 1;
+#else
+ unsigned max_cpus = ms->smp.max_cpus;
+#endif
tcg_allowed = true;
mttcg_enabled = s->mttcg_enabled;
page_init();
tb_htable_init();
- tcg_init(s->tb_size * 1024 * 1024, s->splitwx_enabled);
+ tcg_init(s->tb_size * 1024 * 1024, s->splitwx_enabled, max_cpus);
#if defined(CONFIG_SOFTMMU)
/* There's no guest base to take into account, so go ahead and
diff --git a/tcg/region.c b/tcg/region.c
index 8d88144a22..04b699da63 100644
--- a/tcg/region.c
+++ b/tcg/region.c
@@ -27,9 +27,6 @@
#include "qapi/error.h"
#include "exec/exec-all.h"
#include "tcg/tcg.h"
-#if !defined(CONFIG_USER_ONLY)
-#include "hw/boards.h"
-#endif
#include "internal.h"
@@ -366,27 +363,20 @@ void tcg_region_reset_all(void)
tcg_region_tree_reset_all();
}
+static size_t tcg_n_regions(unsigned max_cpus)
+{
#ifdef CONFIG_USER_ONLY
-static size_t tcg_n_regions(void)
-{
return 1;
-}
#else
-/*
- * It is likely that some vCPUs will translate more code than others, so we
- * first try to set more regions than max_cpus, with those regions being of
- * reasonable size. If that's not possible we make do by evenly dividing
- * the code_gen_buffer among the vCPUs.
- */
-static size_t tcg_n_regions(void)
-{
+ /*
+ * It is likely that some vCPUs will translate more code than others,
+ * so we first try to set more regions than max_cpus, with those regions
+ * being of reasonable size. If that's not possible we make do by evenly
+ * dividing the code_gen_buffer among the vCPUs.
+ */
size_t i;
/* Use a single region if all we have is one vCPU thread */
-#if !defined(CONFIG_USER_ONLY)
- MachineState *ms = MACHINE(qdev_get_machine());
- unsigned int max_cpus = ms->smp.max_cpus;
-#endif
if (max_cpus == 1 || !qemu_tcg_mttcg_enabled()) {
return 1;
}
@@ -405,8 +395,8 @@ static size_t tcg_n_regions(void)
}
/* If we can't, then just allocate one region per vCPU thread */
return max_cpus;
-}
#endif
+}
/* Minimum size of the code gen buffer. This number is randomly chosen,
but not so small that we can't have a fair number of TB's live. */
@@ -838,7 +828,7 @@ static bool alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
* in practice. Multi-threaded guests share most if not all of their translated
* code, which makes parallel code generation less appealing than in softmmu.
*/
-void tcg_region_init(size_t tb_size, int splitwx)
+void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
{
void *buf, *aligned;
size_t size;
@@ -856,7 +846,7 @@ void tcg_region_init(size_t tb_size, int splitwx)
buf = tcg_init_ctx.code_gen_buffer;
size = tcg_init_ctx.code_gen_buffer_size;
page_size = qemu_real_host_page_size;
- n_regions = tcg_n_regions();
+ n_regions = tcg_n_regions(max_cpus);
/* The first region will be 'aligned - buf' bytes larger than the others */
aligned = QEMU_ALIGN_PTR_UP(buf, page_size);
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 65a63bda8a..a89d8f6b81 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -576,7 +576,7 @@ static void process_op_defs(TCGContext *s);
static TCGTemp *tcg_global_reg_new_internal(TCGContext *s, TCGType type,
TCGReg reg, const char *name);
-static void tcg_context_init(void)
+static void tcg_context_init(unsigned max_cpus)
{
TCGContext *s = &tcg_init_ctx;
int op, total_args, n, i;
@@ -645,8 +645,6 @@ static void tcg_context_init(void)
tcg_ctxs = &tcg_ctx;
n_tcg_ctxs = 1;
#else
- MachineState *ms = MACHINE(qdev_get_machine());
- unsigned int max_cpus = ms->smp.max_cpus;
tcg_ctxs = g_new(TCGContext *, max_cpus);
#endif
@@ -655,10 +653,10 @@ static void tcg_context_init(void)
cpu_env = temp_tcgv_ptr(ts);
}
-void tcg_init(size_t tb_size, int splitwx)
+void tcg_init(size_t tb_size, int splitwx, unsigned max_cpus)
{
- tcg_context_init();
- tcg_region_init(tb_size, splitwx);
+ tcg_context_init(max_cpus);
+ tcg_region_init(tb_size, splitwx, max_cpus);
}
/*
--
2.25.1
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH v2 14/29] tcg: Introduce tcg_max_ctxs
2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
` (12 preceding siblings ...)
2021-03-14 21:27 ` [PATCH v2 13/29] accel/tcg: Pass down max_cpus to tcg_init Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
2021-03-14 21:27 ` [PATCH v2 15/29] tcg: Move MAX_CODE_GEN_BUFFER_SIZE to tcg-target.h Richard Henderson
` (16 subsequent siblings)
30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
To: qemu-devel; +Cc: r.bolshakov, j, Philippe Mathieu-Daudé
Finish the divorce of tcg/ from hw/, and do not take
the max cpu value from MachineState; just remember what
we were passed in tcg_init.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
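A minimal sketch of the bookkeeping described above, assuming C11
atomics in place of QEMU's qatomic_* helpers. The names Ctx, ctxs,
cur_ctxs, max_ctxs, ctx_table_init and ctx_register are hypothetical
stand-ins for illustration, not the identifiers used in tcg/tcg.c.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for TCGContext. */
typedef struct Ctx { int id; } Ctx;

static Ctx **ctxs;              /* plays the role of tcg_ctxs */
static atomic_uint cur_ctxs;    /* plays the role of tcg_cur_ctxs */
static unsigned int max_ctxs;   /* plays the role of tcg_max_ctxs */

/* Remember the limit once at init, instead of asking the machine later. */
static void ctx_table_init(unsigned int max_cpus)
{
    max_ctxs = max_cpus;
    ctxs = calloc(max_cpus, sizeof(*ctxs));   /* zero-filled, like g_new0 */
    assert(ctxs != NULL);
    atomic_store(&cur_ctxs, 0);
}

/* Claim the next free slot; the stored limit bounds the index. */
static unsigned int ctx_register(Ctx *c)
{
    unsigned int n = atomic_fetch_add(&cur_ctxs, 1);

    assert(n < max_ctxs);
    ctxs[n] = c;
    return n;
}
```

Readers of the table take a snapshot of cur_ctxs and iterate up to
that count, so the limit is only consulted when a slot is claimed.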
tcg/internal.h | 3 ++-
tcg/region.c | 6 +++---
tcg/tcg.c | 23 ++++++++++-------------
3 files changed, 15 insertions(+), 17 deletions(-)
diff --git a/tcg/internal.h b/tcg/internal.h
index fcfeca232f..f9906523da 100644
--- a/tcg/internal.h
+++ b/tcg/internal.h
@@ -28,7 +28,8 @@
#define TCG_HIGHWATER 1024
extern TCGContext **tcg_ctxs;
-extern unsigned int n_tcg_ctxs;
+extern unsigned int tcg_cur_ctxs;
+extern unsigned int tcg_max_ctxs;
void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus);
bool tcg_region_alloc(TCGContext *s);
diff --git a/tcg/region.c b/tcg/region.c
index 04b699da63..e3fbf6a7e7 100644
--- a/tcg/region.c
+++ b/tcg/region.c
@@ -347,7 +347,7 @@ void tcg_region_initial_alloc(TCGContext *s)
/* Call from a safe-work context */
void tcg_region_reset_all(void)
{
- unsigned int n_ctxs = qatomic_read(&n_tcg_ctxs);
+ unsigned int n_ctxs = qatomic_read(&tcg_cur_ctxs);
unsigned int i;
 qemu_mutex_lock(&region.lock);
@@ -922,7 +922,7 @@ void tcg_region_prologue_set(TCGContext *s)
*/
size_t tcg_code_size(void)
{
- unsigned int n_ctxs = qatomic_read(&n_tcg_ctxs);
+ unsigned int n_ctxs = qatomic_read(&tcg_cur_ctxs);
unsigned int i;
size_t total;
@@ -958,7 +958,7 @@ size_t tcg_code_capacity(void)
size_t tcg_tb_phys_invalidate_count(void)
{
- unsigned int n_ctxs = qatomic_read(&n_tcg_ctxs);
+ unsigned int n_ctxs = qatomic_read(&tcg_cur_ctxs);
unsigned int i;
size_t total = 0;
diff --git a/tcg/tcg.c b/tcg/tcg.c
index a89d8f6b81..a82d3a0861 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -44,11 +44,6 @@
#include "cpu.h"
#include "exec/exec-all.h"
-
-#if !defined(CONFIG_USER_ONLY)
-#include "hw/boards.h"
-#endif
-
#include "tcg/tcg-op.h"
#if UINTPTR_MAX == UINT32_MAX
@@ -155,7 +150,8 @@ static int tcg_out_ldst_finalize(TCGContext *s);
#endif
TCGContext **tcg_ctxs;
-unsigned int n_tcg_ctxs;
+unsigned int tcg_cur_ctxs;
+unsigned int tcg_max_ctxs;
TCGv_env cpu_env = 0;
const void *tcg_code_gen_epilogue;
uintptr_t tcg_splitwx_diff;
@@ -475,7 +471,6 @@ void tcg_register_thread(void)
#else
void tcg_register_thread(void)
{
- MachineState *ms = MACHINE(qdev_get_machine());
TCGContext *s = g_malloc(sizeof(*s));
unsigned int i, n;
@@ -491,8 +486,8 @@ void tcg_register_thread(void)
}
/* Claim an entry in tcg_ctxs */
- n = qatomic_fetch_inc(&n_tcg_ctxs);
- g_assert(n < ms->smp.max_cpus);
+ n = qatomic_fetch_inc(&tcg_cur_ctxs);
+ g_assert(n < tcg_max_ctxs);
qatomic_set(&tcg_ctxs[n], s);
if (n > 0) {
@@ -643,9 +638,11 @@ static void tcg_context_init(unsigned max_cpus)
*/
#ifdef CONFIG_USER_ONLY
tcg_ctxs = &tcg_ctx;
- n_tcg_ctxs = 1;
+ tcg_cur_ctxs = 1;
+ tcg_max_ctxs = 1;
#else
- tcg_ctxs = g_new(TCGContext *, max_cpus);
+ tcg_max_ctxs = max_cpus;
+ tcg_ctxs = g_new0(TCGContext *, max_cpus);
#endif
tcg_debug_assert(!tcg_regset_test_reg(s->reserved_regs, TCG_AREG0));
@@ -3937,7 +3934,7 @@ static void tcg_reg_alloc_call(TCGContext *s, TCGOp *op)
static inline
void tcg_profile_snapshot(TCGProfile *prof, bool counters, bool table)
{
- unsigned int n_ctxs = qatomic_read(&n_tcg_ctxs);
+ unsigned int n_ctxs = qatomic_read(&tcg_cur_ctxs);
unsigned int i;
for (i = 0; i < n_ctxs; i++) {
@@ -4000,7 +3997,7 @@ void tcg_dump_op_count(void)
int64_t tcg_cpu_exec_time(void)
{
- unsigned int n_ctxs = qatomic_read(&n_tcg_ctxs);
+ unsigned int n_ctxs = qatomic_read(&tcg_cur_ctxs);
unsigned int i;
int64_t ret = 0;
--
2.25.1
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH v2 15/29] tcg: Move MAX_CODE_GEN_BUFFER_SIZE to tcg-target.h
2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
` (13 preceding siblings ...)
2021-03-14 21:27 ` [PATCH v2 14/29] tcg: Introduce tcg_max_ctxs Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
2021-03-14 21:27 ` [PATCH v2 16/29] tcg: Replace region.end with region.total_size Richard Henderson
` (15 subsequent siblings)
30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
To: qemu-devel; +Cc: r.bolshakov, j
Remove the ifdef ladder and move each define into the
appropriate header file.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v2: Retain comment about MAX_CODE_GEN_BUFFER_SIZE constraint (balaton)
---
tcg/aarch64/tcg-target.h | 1 +
tcg/arm/tcg-target.h | 1 +
tcg/i386/tcg-target.h | 2 ++
tcg/mips/tcg-target.h | 6 ++++++
tcg/ppc/tcg-target.h | 2 ++
tcg/riscv/tcg-target.h | 1 +
tcg/s390/tcg-target.h | 3 +++
tcg/sparc/tcg-target.h | 1 +
tcg/tci/tcg-target.h | 1 +
tcg/region.c | 35 +++++++++--------------------------
10 files changed, 27 insertions(+), 26 deletions(-)
diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h
index 5ec30dba25..ef55f7c185 100644
--- a/tcg/aarch64/tcg-target.h
+++ b/tcg/aarch64/tcg-target.h
@@ -15,6 +15,7 @@
#define TCG_TARGET_INSN_UNIT_SIZE 4
#define TCG_TARGET_TLB_DISPLACEMENT_BITS 24
+#define MAX_CODE_GEN_BUFFER_SIZE (2 * GiB)
#undef TCG_TARGET_STACK_GROWSUP
typedef enum {
diff --git a/tcg/arm/tcg-target.h b/tcg/arm/tcg-target.h
index 8d1fee6327..b9a85d0f83 100644
--- a/tcg/arm/tcg-target.h
+++ b/tcg/arm/tcg-target.h
@@ -60,6 +60,7 @@ extern int arm_arch;
#undef TCG_TARGET_STACK_GROWSUP
#define TCG_TARGET_INSN_UNIT_SIZE 4
#define TCG_TARGET_TLB_DISPLACEMENT_BITS 16
+#define MAX_CODE_GEN_BUFFER_SIZE UINT32_MAX
typedef enum {
TCG_REG_R0 = 0,
diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
index b693d3692d..ac10066c3e 100644
--- a/tcg/i386/tcg-target.h
+++ b/tcg/i386/tcg-target.h
@@ -31,9 +31,11 @@
#ifdef __x86_64__
# define TCG_TARGET_REG_BITS 64
# define TCG_TARGET_NB_REGS 32
+# define MAX_CODE_GEN_BUFFER_SIZE (2 * GiB)
#else
# define TCG_TARGET_REG_BITS 32
# define TCG_TARGET_NB_REGS 24
+# define MAX_CODE_GEN_BUFFER_SIZE UINT32_MAX
#endif
typedef enum {
diff --git a/tcg/mips/tcg-target.h b/tcg/mips/tcg-target.h
index c2c32fb38f..e81e824cab 100644
--- a/tcg/mips/tcg-target.h
+++ b/tcg/mips/tcg-target.h
@@ -39,6 +39,12 @@
#define TCG_TARGET_TLB_DISPLACEMENT_BITS 16
#define TCG_TARGET_NB_REGS 32
+/*
+ * We have a 256MB branch region, but leave room to make sure the
+ * main executable is also within that region.
+ */
+#define MAX_CODE_GEN_BUFFER_SIZE (128 * MiB)
+
typedef enum {
TCG_REG_ZERO = 0,
TCG_REG_AT,
diff --git a/tcg/ppc/tcg-target.h b/tcg/ppc/tcg-target.h
index d1339afc66..c13ed5640a 100644
--- a/tcg/ppc/tcg-target.h
+++ b/tcg/ppc/tcg-target.h
@@ -27,8 +27,10 @@
#ifdef _ARCH_PPC64
# define TCG_TARGET_REG_BITS 64
+# define MAX_CODE_GEN_BUFFER_SIZE (2 * GiB)
#else
# define TCG_TARGET_REG_BITS 32
+# define MAX_CODE_GEN_BUFFER_SIZE (32 * MiB)
#endif
#define TCG_TARGET_NB_REGS 64
diff --git a/tcg/riscv/tcg-target.h b/tcg/riscv/tcg-target.h
index 727c8df418..87ea94666b 100644
--- a/tcg/riscv/tcg-target.h
+++ b/tcg/riscv/tcg-target.h
@@ -34,6 +34,7 @@
#define TCG_TARGET_INSN_UNIT_SIZE 4
#define TCG_TARGET_TLB_DISPLACEMENT_BITS 20
#define TCG_TARGET_NB_REGS 32
+#define MAX_CODE_GEN_BUFFER_SIZE ((size_t)-1)
typedef enum {
TCG_REG_ZERO,
diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
index 641464eea4..b04b72b7eb 100644
--- a/tcg/s390/tcg-target.h
+++ b/tcg/s390/tcg-target.h
@@ -28,6 +28,9 @@
#define TCG_TARGET_INSN_UNIT_SIZE 2
#define TCG_TARGET_TLB_DISPLACEMENT_BITS 19
+/* We have a +- 4GB range on the branches; leave some slop. */
+#define MAX_CODE_GEN_BUFFER_SIZE (3 * GiB)
+
typedef enum TCGReg {
TCG_REG_R0 = 0,
TCG_REG_R1,
diff --git a/tcg/sparc/tcg-target.h b/tcg/sparc/tcg-target.h
index f66f5d07dc..86bb9a2d39 100644
--- a/tcg/sparc/tcg-target.h
+++ b/tcg/sparc/tcg-target.h
@@ -30,6 +30,7 @@
#define TCG_TARGET_INSN_UNIT_SIZE 4
#define TCG_TARGET_TLB_DISPLACEMENT_BITS 32
#define TCG_TARGET_NB_REGS 32
+#define MAX_CODE_GEN_BUFFER_SIZE (2 * GiB)
typedef enum {
TCG_REG_G0 = 0,
diff --git a/tcg/tci/tcg-target.h b/tcg/tci/tcg-target.h
index 9c0021a26f..03cf527cb4 100644
--- a/tcg/tci/tcg-target.h
+++ b/tcg/tci/tcg-target.h
@@ -43,6 +43,7 @@
#define TCG_TARGET_INTERPRETER 1
#define TCG_TARGET_INSN_UNIT_SIZE 1
#define TCG_TARGET_TLB_DISPLACEMENT_BITS 32
+#define MAX_CODE_GEN_BUFFER_SIZE ((size_t)-1)
#if UINTPTR_MAX == UINT32_MAX
# define TCG_TARGET_REG_BITS 32
diff --git a/tcg/region.c b/tcg/region.c
index e3fbf6a7e7..ae22308290 100644
--- a/tcg/region.c
+++ b/tcg/region.c
@@ -398,34 +398,17 @@ static size_t tcg_n_regions(unsigned max_cpus)
#endif
}
-/* Minimum size of the code gen buffer. This number is randomly chosen,
- but not so small that we can't have a fair number of TB's live. */
+/*
+ * Minimum size of the code gen buffer. This number is randomly chosen,
+ * but not so small that we can't have a fair number of TB's live.
+ *
+ * Maximum size, MAX_CODE_GEN_BUFFER_SIZE, is defined in tcg-target.h.
+ * Unless otherwise indicated, this is constrained by the range of
+ * direct branches on the host cpu, as used by the TCG implementation
+ * of goto_tb.
+ */
#define MIN_CODE_GEN_BUFFER_SIZE (1 * MiB)
-/* Maximum size of the code gen buffer we'd like to use. Unless otherwise
- indicated, this is constrained by the range of direct branches on the
- host cpu, as used by the TCG implementation of goto_tb. */
-#if defined(__x86_64__)
-# define MAX_CODE_GEN_BUFFER_SIZE (2 * GiB)
-#elif defined(__sparc__)
-# define MAX_CODE_GEN_BUFFER_SIZE (2 * GiB)
-#elif defined(__powerpc64__)
-# define MAX_CODE_GEN_BUFFER_SIZE (2 * GiB)
-#elif defined(__powerpc__)
-# define MAX_CODE_GEN_BUFFER_SIZE (32 * MiB)
-#elif defined(__aarch64__)
-# define MAX_CODE_GEN_BUFFER_SIZE (2 * GiB)
-#elif defined(__s390x__)
- /* We have a +- 4GB range on the branches; leave some slop. */
-# define MAX_CODE_GEN_BUFFER_SIZE (3 * GiB)
-#elif defined(__mips__)
- /* We have a 256MB branch region, but leave room to make sure the
- main executable is also within that region. */
-# define MAX_CODE_GEN_BUFFER_SIZE (128 * MiB)
-#else
-# define MAX_CODE_GEN_BUFFER_SIZE ((size_t)-1)
-#endif
-
#if TCG_TARGET_REG_BITS == 32
#define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (32 * MiB)
#ifdef CONFIG_USER_ONLY
--
2.25.1
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH v2 16/29] tcg: Replace region.end with region.total_size
2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
` (14 preceding siblings ...)
2021-03-14 21:27 ` [PATCH v2 15/29] tcg: Move MAX_CODE_GEN_BUFFER_SIZE to tcg-target.h Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
2021-03-14 21:27 ` [PATCH v2 17/29] tcg: Rename region.start to region.after_prologue Richard Henderson
` (14 subsequent siblings)
30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
To: qemu-devel; +Cc: r.bolshakov, j
A size is easier to work with than an end point,
particularly during initial buffer allocation.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
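As a rough illustration of why a size is convenient here, the sketch
below redoes the capacity arithmetic with only sizes and counts. The
struct region_layout and code_capacity names are hypothetical, and
HIGHWATER simply mirrors the value of TCG_HIGHWATER; total_size is
assumed to exclude the final guard page, as in this patch.

```c
#include <assert.h>
#include <stddef.h>

#define HIGHWATER 1024  /* mirrors TCG_HIGHWATER in tcg/internal.h */

/* Hypothetical subset of struct tcg_region_state. */
struct region_layout {
    size_t total_size;  /* whole buffer, minus the final guard page */
    size_t n;           /* number of regions */
    size_t size;        /* usable bytes in one region */
    size_t stride;      /* size plus one guard page */
};

/* Usable bytes: drop the interior guards and the per-region highwater. */
static size_t code_capacity(const struct region_layout *r)
{
    size_t guard_size = r->stride - r->size;
    size_t capacity = r->total_size;

    assert(r->n >= 1);
    capacity -= (r->n - 1) * guard_size;
    capacity -= r->n * HIGHWATER;
    return capacity;
}
```

With no end pointer involved, the result depends only on values that
are fixed at init time.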
tcg/region.c | 29 +++++++++++++++++------------
1 file changed, 17 insertions(+), 12 deletions(-)
diff --git a/tcg/region.c b/tcg/region.c
index ae22308290..8e4dd0480b 100644
--- a/tcg/region.c
+++ b/tcg/region.c
@@ -48,7 +48,7 @@ struct tcg_region_state {
/* fields set at init time */
void *start;
void *start_aligned;
- void *end;
+ size_t total_size; /* size of entire buffer */
size_t n;
size_t size; /* size of one region */
size_t stride; /* .size + guard size */
@@ -279,7 +279,7 @@ static void tcg_region_bounds(size_t curr_region, void **pstart, void **pend)
start = region.start;
}
if (curr_region == region.n - 1) {
- end = region.end;
+ end = region.start_aligned + region.total_size;
}
*pstart = start;
@@ -813,8 +813,8 @@ static bool alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
*/
void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
{
- void *buf, *aligned;
- size_t size;
+ void *buf, *aligned, *end;
+ size_t total_size;
size_t page_size;
size_t region_size;
size_t n_regions;
@@ -827,19 +827,20 @@ void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
assert(ok);
buf = tcg_init_ctx.code_gen_buffer;
- size = tcg_init_ctx.code_gen_buffer_size;
+ total_size = tcg_init_ctx.code_gen_buffer_size;
page_size = qemu_real_host_page_size;
n_regions = tcg_n_regions(max_cpus);
/* The first region will be 'aligned - buf' bytes larger than the others */
aligned = QEMU_ALIGN_PTR_UP(buf, page_size);
- g_assert(aligned < tcg_init_ctx.code_gen_buffer + size);
+ g_assert(aligned < tcg_init_ctx.code_gen_buffer + total_size);
+
/*
* Make region_size a multiple of page_size, using aligned as the start.
* As a result of this we might end up with a few extra pages at the end of
* the buffer; we will assign those to the last region.
*/
- region_size = (size - (aligned - buf)) / n_regions;
+ region_size = (total_size - (aligned - buf)) / n_regions;
region_size = QEMU_ALIGN_DOWN(region_size, page_size);
/* A region must have at least 2 pages; one code, one guard */
@@ -853,9 +854,11 @@ void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
region.start = buf;
region.start_aligned = aligned;
/* page-align the end, since its last page will be a guard page */
- region.end = QEMU_ALIGN_PTR_DOWN(buf + size, page_size);
+ end = QEMU_ALIGN_PTR_DOWN(buf + total_size, page_size);
/* account for that last guard page */
- region.end -= page_size;
+ end -= page_size;
+ total_size = end - aligned;
+ region.total_size = total_size;
/* set guard pages */
splitwx_diff = tcg_splitwx_diff;
@@ -893,7 +896,7 @@ void tcg_region_prologue_set(TCGContext *s)
/* Register the balance of the buffer with gdb. */
tcg_register_jit(tcg_splitwx_to_rx(region.start),
- region.end - region.start);
+ region.start_aligned + region.total_size - region.start);
}
/*
@@ -934,8 +937,10 @@ size_t tcg_code_capacity(void)
/* no need for synchronization; these variables are set at init time */
guard_size = region.stride - region.size;
- capacity = region.end + guard_size - region.start;
- capacity -= region.n * (guard_size + TCG_HIGHWATER);
+ capacity = region.total_size;
+ capacity -= (region.n - 1) * guard_size;
+ capacity -= region.n * TCG_HIGHWATER;
+
return capacity;
}
--
2.25.1
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH v2 17/29] tcg: Rename region.start to region.after_prologue
2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
` (15 preceding siblings ...)
2021-03-14 21:27 ` [PATCH v2 16/29] tcg: Replace region.end with region.total_size Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
2021-03-14 21:27 ` [PATCH v2 18/29] tcg: Tidy tcg_n_regions Richard Henderson
` (13 subsequent siblings)
30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
To: qemu-devel; +Cc: r.bolshakov, j
Give the field a name reflecting its actual meaning.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/region.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/tcg/region.c b/tcg/region.c
index 8e4dd0480b..23261561a1 100644
--- a/tcg/region.c
+++ b/tcg/region.c
@@ -46,8 +46,8 @@ struct tcg_region_state {
QemuMutex lock;
/* fields set at init time */
- void *start;
void *start_aligned;
+ void *after_prologue;
size_t total_size; /* size of entire buffer */
size_t n;
size_t size; /* size of one region */
@@ -276,7 +276,7 @@ static void tcg_region_bounds(size_t curr_region, void **pstart, void **pend)
end = start + region.size;
if (curr_region == 0) {
- start = region.start;
+ start = region.after_prologue;
}
if (curr_region == region.n - 1) {
end = region.start_aligned + region.total_size;
@@ -851,7 +851,7 @@ void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
region.n = n_regions;
region.size = region_size - page_size;
region.stride = region_size;
- region.start = buf;
+ region.after_prologue = buf;
region.start_aligned = aligned;
/* page-align the end, since its last page will be a guard page */
end = QEMU_ALIGN_PTR_DOWN(buf + total_size, page_size);
@@ -888,15 +888,16 @@ void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
void tcg_region_prologue_set(TCGContext *s)
{
/* Deduct the prologue from the first region. */
- g_assert(region.start == s->code_gen_buffer);
- region.start = s->code_ptr;
+ g_assert(region.start_aligned == s->code_gen_buffer);
+ region.after_prologue = s->code_ptr;
/* Recompute boundaries of the first region. */
tcg_region_assign(s, 0);
/* Register the balance of the buffer with gdb. */
- tcg_register_jit(tcg_splitwx_to_rx(region.start),
- region.start_aligned + region.total_size - region.start);
+ tcg_register_jit(tcg_splitwx_to_rx(region.after_prologue),
+ region.start_aligned + region.total_size -
+ region.after_prologue);
}
/*
--
2.25.1
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH v2 18/29] tcg: Tidy tcg_n_regions
2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
` (16 preceding siblings ...)
2021-03-14 21:27 ` [PATCH v2 17/29] tcg: Rename region.start to region.after_prologue Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
2021-03-14 21:27 ` [PATCH v2 19/29] tcg: Tidy split_cross_256mb Richard Henderson
` (12 subsequent siblings)
30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
To: qemu-devel; +Cc: r.bolshakov, j
Compute the value using straight division and bounds,
rather than a loop. Pass in tb_size rather than reading
from tcg_init_ctx.code_gen_buffer_size.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
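For reference, a standalone sketch of the division-and-bounds
computation. It assumes the softmmu case and leaves out the
qemu_tcg_mttcg_enabled() check; n_regions_for and MIB are names
made up for this example.

```c
#include <assert.h>
#include <stddef.h>

#define MIB ((size_t)1024 * 1024)

/*
 * Aim for regions of at least 2 MiB, with no fewer than one region
 * per vCPU and no more than eight per vCPU.
 */
static size_t n_regions_for(size_t tb_size, unsigned max_cpus)
{
    size_t n_regions, cap;

    assert(max_cpus >= 1);
    if (max_cpus == 1) {
        return 1;
    }

    n_regions = tb_size / (2 * MIB);
    if (n_regions <= max_cpus) {
        return max_cpus;
    }

    cap = (size_t)max_cpus * 8;
    return n_regions < cap ? n_regions : cap;
}
```

For example, a 32 MiB buffer with 4 vCPUs yields 16 regions, while a
1 GiB buffer with 4 vCPUs is capped at 32.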
tcg/region.c | 29 ++++++++++++-----------------
1 file changed, 12 insertions(+), 17 deletions(-)
diff --git a/tcg/region.c b/tcg/region.c
index 23261561a1..23b3459c61 100644
--- a/tcg/region.c
+++ b/tcg/region.c
@@ -363,38 +363,33 @@ void tcg_region_reset_all(void)
tcg_region_tree_reset_all();
}
-static size_t tcg_n_regions(unsigned max_cpus)
+static size_t tcg_n_regions(size_t tb_size, unsigned max_cpus)
{
#ifdef CONFIG_USER_ONLY
return 1;
#else
+ size_t n_regions;
+
/*
* It is likely that some vCPUs will translate more code than others,
* so we first try to set more regions than max_cpus, with those regions
* being of reasonable size. If that's not possible we make do by evenly
* dividing the code_gen_buffer among the vCPUs.
*/
- size_t i;
-
/* Use a single region if all we have is one vCPU thread */
if (max_cpus == 1 || !qemu_tcg_mttcg_enabled()) {
return 1;
}
- /* Try to have more regions than max_cpus, with each region being >= 2 MB */
- for (i = 8; i > 0; i--) {
- size_t regions_per_thread = i;
- size_t region_size;
-
- region_size = tcg_init_ctx.code_gen_buffer_size;
- region_size /= max_cpus * regions_per_thread;
-
- if (region_size >= 2 * 1024u * 1024) {
- return max_cpus * regions_per_thread;
- }
+ /*
+ * Try to have more regions than max_cpus, with each region being >= 2 MB.
+ * If we can't, then just allocate one region per vCPU thread.
+ */
+ n_regions = tb_size / (2 * MiB);
+ if (n_regions <= max_cpus) {
+ return max_cpus;
}
- /* If we can't, then just allocate one region per vCPU thread */
- return max_cpus;
+ return MIN(n_regions, max_cpus * 8);
#endif
}
@@ -829,7 +824,7 @@ void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
buf = tcg_init_ctx.code_gen_buffer;
total_size = tcg_init_ctx.code_gen_buffer_size;
page_size = qemu_real_host_page_size;
- n_regions = tcg_n_regions(max_cpus);
+ n_regions = tcg_n_regions(total_size, max_cpus);
/* The first region will be 'aligned - buf' bytes larger than the others */
aligned = QEMU_ALIGN_PTR_UP(buf, page_size);
--
2.25.1
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH v2 19/29] tcg: Tidy split_cross_256mb
2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
` (17 preceding siblings ...)
2021-03-14 21:27 ` [PATCH v2 18/29] tcg: Tidy tcg_n_regions Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
2021-03-14 21:27 ` [PATCH v2 20/29] tcg: Move in_code_gen_buffer and tests to region.c Richard Henderson
` (11 subsequent siblings)
30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
To: qemu-devel; +Cc: r.bolshakov, j
Return output buffer and size via output pointer arguments,
rather than returning size via tcg_ctx->code_gen_buffer_size.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
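A small sketch of the new calling convention, using integer addresses
so that it can be exercised without a real mapping. The function name
split_cross_256mb_addr is hypothetical; the arithmetic follows the
function in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Keep the larger of the two pieces on either side of the 256MB
 * boundary; report the result through the output pointers.
 */
static void split_cross_256mb_addr(uintptr_t *obuf, size_t *osize,
                                   uintptr_t buf1, size_t size1)
{
    /* First 256MB-aligned address at or below the end of the buffer. */
    uintptr_t buf2 = (buf1 + size1) & ~(uintptr_t)0x0fffffff;
    size_t size2 = buf1 + size1 - buf2;

    size1 = buf2 - buf1;
    if (size1 < size2) {
        size1 = size2;
        buf1 = buf2;
    }

    *obuf = buf1;
    *osize = size1;
}
```

Because the inputs are passed by value, the caller may pass the same
variables as both input and output, as alloc_code_gen_buffer does.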
tcg/region.c | 15 +++++++--------
1 file changed, 7 insertions(+), 8 deletions(-)
diff --git a/tcg/region.c b/tcg/region.c
index 23b3459c61..45c1178815 100644
--- a/tcg/region.c
+++ b/tcg/region.c
@@ -467,7 +467,8 @@ static inline bool cross_256mb(void *addr, size_t size)
/* We weren't able to allocate a buffer without crossing that boundary,
so make do with the larger portion of the buffer that doesn't cross.
Returns the new base of the buffer, and adjusts code_gen_buffer_size. */
-static inline void *split_cross_256mb(void *buf1, size_t size1)
+static inline void split_cross_256mb(void **obuf, size_t *osize,
+ void *buf1, size_t size1)
{
void *buf2 = (void *)(((uintptr_t)buf1 + size1) & ~0x0ffffffful);
size_t size2 = buf1 + size1 - buf2;
@@ -478,8 +479,8 @@ static inline void *split_cross_256mb(void *buf1, size_t size1)
buf1 = buf2;
}
- tcg_ctx->code_gen_buffer_size = size1;
- return buf1;
+ *obuf = buf1;
+ *osize = size1;
}
#endif
@@ -509,12 +510,10 @@ static bool alloc_code_gen_buffer(size_t tb_size, int splitwx, Error **errp)
if (size > tb_size) {
size = QEMU_ALIGN_DOWN(tb_size, qemu_real_host_page_size);
}
- tcg_ctx->code_gen_buffer_size = size;
#ifdef __mips__
if (cross_256mb(buf, size)) {
- buf = split_cross_256mb(buf, size);
- size = tcg_ctx->code_gen_buffer_size;
+ split_cross_256mb(&buf, &size, buf, size);
}
#endif
@@ -525,6 +524,7 @@ static bool alloc_code_gen_buffer(size_t tb_size, int splitwx, Error **errp)
qemu_madvise(buf, size, QEMU_MADV_HUGEPAGE);
tcg_ctx->code_gen_buffer = buf;
+ tcg_ctx->code_gen_buffer_size = size;
return true;
}
#elif defined(_WIN32)
@@ -583,8 +583,7 @@ static bool alloc_code_gen_buffer_anon(size_t size, int prot,
/* fallthru */
default:
/* Split the original buffer. Free the smaller half. */
- buf2 = split_cross_256mb(buf, size);
- size2 = tcg_ctx->code_gen_buffer_size;
+ split_cross_256mb(&buf2, &size2, buf, size);
if (buf == buf2) {
munmap(buf + size2, size - size2);
} else {
--
2.25.1
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH v2 20/29] tcg: Move in_code_gen_buffer and tests to region.c
2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
` (18 preceding siblings ...)
2021-03-14 21:27 ` [PATCH v2 19/29] tcg: Tidy split_cross_256mb Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
2021-03-14 21:27 ` [PATCH v2 21/29] tcg: Allocate code_gen_buffer into struct tcg_region_state Richard Henderson
` (10 subsequent siblings)
30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
To: qemu-devel; +Cc: r.bolshakov, j
Shortly, the full code_gen_buffer will only be visible
to region.c, so move in_code_gen_buffer out-of-line.
Move the debugging versions of tcg_splitwx_to_{rx,rw}
to region.c as well, so that the compiler gets to see
the implementation of in_code_gen_buffer.
This leaves exactly one use of in_code_gen_buffer outside
of region.c, in cpu_restore_state, which is on the
exception path and therefore not performance critical.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
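The range check being moved relies on unsigned wraparound, which may
be worth spelling out. The sketch below uses integer addresses rather
than void pointers (arithmetic on void * is a GNU C extension);
in_buffer and its parameters are hypothetical names for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * One comparison covers both bounds: if p is below start, the
 * unsigned subtraction wraps to a very large value and the test
 * fails.  The one-past-the-end address is accepted on purpose.
 */
static bool in_buffer(uintptr_t p, uintptr_t start, size_t size)
{
    return (size_t)(p - start) <= size;
}
```

A pointer exactly at start + size passes, matching the comment about
pointers one byte past the end of an array.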
include/tcg/tcg.h | 11 +----------
tcg/region.c | 34 ++++++++++++++++++++++++++++++++++
tcg/tcg.c | 23 -----------------------
3 files changed, 35 insertions(+), 33 deletions(-)
diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index a0122c0dd3..a19deb529f 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -696,16 +696,7 @@ extern const void *tcg_code_gen_epilogue;
extern uintptr_t tcg_splitwx_diff;
extern TCGv_env cpu_env;
-static inline bool in_code_gen_buffer(const void *p)
-{
- const TCGContext *s = &tcg_init_ctx;
- /*
- * Much like it is valid to have a pointer to the byte past the
- * end of an array (so long as you don't dereference it), allow
- * a pointer to the byte past the end of the code gen buffer.
- */
- return (size_t)(p - s->code_gen_buffer) <= s->code_gen_buffer_size;
-}
+bool in_code_gen_buffer(const void *p);
#ifdef CONFIG_DEBUG_TCG
const void *tcg_splitwx_to_rx(void *rw);
diff --git a/tcg/region.c b/tcg/region.c
index 45c1178815..bf4167e467 100644
--- a/tcg/region.c
+++ b/tcg/region.c
@@ -68,6 +68,40 @@ static struct tcg_region_state region;
static void *region_trees;
static size_t tree_size;
+bool in_code_gen_buffer(const void *p)
+{
+ const TCGContext *s = &tcg_init_ctx;
+ /*
+ * Much like it is valid to have a pointer to the byte past the
+ * end of an array (so long as you don't dereference it), allow
+ * a pointer to the byte past the end of the code gen buffer.
+ */
+ return (size_t)(p - s->code_gen_buffer) <= s->code_gen_buffer_size;
+}
+
+#ifdef CONFIG_DEBUG_TCG
+const void *tcg_splitwx_to_rx(void *rw)
+{
+ /* Pass NULL pointers unchanged. */
+ if (rw) {
+ g_assert(in_code_gen_buffer(rw));
+ rw += tcg_splitwx_diff;
+ }
+ return rw;
+}
+
+void *tcg_splitwx_to_rw(const void *rx)
+{
+ /* Pass NULL pointers unchanged. */
+ if (rx) {
+ rx -= tcg_splitwx_diff;
+ /* Assert that we end with a pointer in the rw region. */
+ g_assert(in_code_gen_buffer(rx));
+ }
+ return (void *)rx;
+}
+#endif /* CONFIG_DEBUG_TCG */
+
/* compare a pointer @ptr and a tb_tc @s */
static int ptr_cmp_tb_tc(const void *ptr, const struct tb_tc *s)
{
diff --git a/tcg/tcg.c b/tcg/tcg.c
index a82d3a0861..65f9cf01d5 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -416,29 +416,6 @@ static const TCGTargetOpDef constraint_sets[] = {
#include "tcg-target.c.inc"
-#ifdef CONFIG_DEBUG_TCG
-const void *tcg_splitwx_to_rx(void *rw)
-{
- /* Pass NULL pointers unchanged. */
- if (rw) {
- g_assert(in_code_gen_buffer(rw));
- rw += tcg_splitwx_diff;
- }
- return rw;
-}
-
-void *tcg_splitwx_to_rw(const void *rx)
-{
- /* Pass NULL pointers unchanged. */
- if (rx) {
- rx -= tcg_splitwx_diff;
- /* Assert that we end with a pointer in the rw region. */
- g_assert(in_code_gen_buffer(rx));
- }
- return (void *)rx;
-}
-#endif /* CONFIG_DEBUG_TCG */
-
static void alloc_tcg_plugin_context(TCGContext *s)
{
#ifdef CONFIG_PLUGIN
--
2.25.1
^ permalink raw reply related [flat|nested] 38+ messages in thread
* [PATCH v2 21/29] tcg: Allocate code_gen_buffer into struct tcg_region_state
2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
` (19 preceding siblings ...)
2021-03-14 21:27 ` [PATCH v2 20/29] tcg: Move in_code_gen_buffer and tests to region.c Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
2021-03-14 21:27 ` [PATCH v2 22/29] tcg: Return the map protection from alloc_code_gen_buffer Richard Henderson
` (9 subsequent siblings)
30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
To: qemu-devel; +Cc: r.bolshakov, j
Do not mess around with setting values within tcg_init_ctx.
Put the values into 'region' directly, which is where they
will live for the lifetime of the program.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
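A rough sketch of the layout arithmetic that tcg_region_init performs
once the buffer lives in 'region'. The struct layout and
region_layout names are hypothetical, and the buffer is assumed to
start on a page boundary.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of struct tcg_region_state. */
struct layout {
    size_t n;           /* number of regions */
    size_t size;        /* usable bytes per region */
    size_t stride;      /* size plus one guard page */
    size_t total_size;  /* buffer size, minus the final guard page */
};

static struct layout region_layout(size_t total_size, size_t n,
                                   size_t page_size)
{
    struct layout l;
    size_t region_size = total_size / n;

    /* Round down to a page multiple, as QEMU_ALIGN_DOWN would. */
    region_size -= region_size % page_size;
    /* A region must have at least 2 pages; one code, one guard. */
    assert(region_size >= 2 * page_size);

    l.n = n;
    l.stride = region_size;
    l.size = region_size - page_size;
    l.total_size = total_size - page_size;
    return l;
}
```

Any pages left over after the division end up in the last region,
since its end is derived from total_size rather than from the stride.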
tcg/region.c | 64 ++++++++++++++++++++++------------------------------
1 file changed, 27 insertions(+), 37 deletions(-)
diff --git a/tcg/region.c b/tcg/region.c
index bf4167e467..9a2b014838 100644
--- a/tcg/region.c
+++ b/tcg/region.c
@@ -70,13 +70,12 @@ static size_t tree_size;
bool in_code_gen_buffer(const void *p)
{
- const TCGContext *s = &tcg_init_ctx;
/*
* Much like it is valid to have a pointer to the byte past the
* end of an array (so long as you don't dereference it), allow
* a pointer to the byte past the end of the code gen buffer.
*/
- return (size_t)(p - s->code_gen_buffer) <= s->code_gen_buffer_size;
+ return (size_t)(p - region.start_aligned) <= region.total_size;
}
#ifdef CONFIG_DEBUG_TCG
@@ -557,8 +556,8 @@ static bool alloc_code_gen_buffer(size_t tb_size, int splitwx, Error **errp)
}
qemu_madvise(buf, size, QEMU_MADV_HUGEPAGE);
- tcg_ctx->code_gen_buffer = buf;
- tcg_ctx->code_gen_buffer_size = size;
+ region.start_aligned = buf;
+ region.total_size = size;
return true;
}
#elif defined(_WIN32)
@@ -579,8 +578,8 @@ static bool alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
return false;
}
- tcg_ctx->code_gen_buffer = buf;
- tcg_ctx->code_gen_buffer_size = size;
+ region.start_aligned = buf;
+ region.total_size = size;
return true;
}
#else
@@ -595,7 +594,6 @@ static bool alloc_code_gen_buffer_anon(size_t size, int prot,
"allocate %zu bytes for jit buffer", size);
return false;
}
- tcg_ctx->code_gen_buffer_size = size;
#ifdef __mips__
if (cross_256mb(buf, size)) {
@@ -633,7 +631,8 @@ static bool alloc_code_gen_buffer_anon(size_t size, int prot,
/* Request large pages for the buffer. */
qemu_madvise(buf, size, QEMU_MADV_HUGEPAGE);
- tcg_ctx->code_gen_buffer = buf;
+ region.start_aligned = buf;
+ region.total_size = size;
return true;
}
@@ -654,8 +653,8 @@ static bool alloc_code_gen_buffer_splitwx_memfd(size_t size, Error **errp)
return false;
}
/* The size of the mapping may have been adjusted. */
- size = tcg_ctx->code_gen_buffer_size;
- buf_rx = tcg_ctx->code_gen_buffer;
+ buf_rx = region.start_aligned;
+ size = region.total_size;
#endif
buf_rw = qemu_memfd_alloc("tcg-jit", size, 0, &fd, errp);
@@ -677,8 +676,8 @@ static bool alloc_code_gen_buffer_splitwx_memfd(size_t size, Error **errp)
#endif
close(fd);
- tcg_ctx->code_gen_buffer = buf_rw;
- tcg_ctx->code_gen_buffer_size = size;
+ region.start_aligned = buf_rw;
+ region.total_size = size;
tcg_splitwx_diff = buf_rx - buf_rw;
/* Request large pages for the buffer and the splitwx. */
@@ -729,7 +728,7 @@ static bool alloc_code_gen_buffer_splitwx_vmremap(size_t size, Error **errp)
return false;
}
- buf_rw = (mach_vm_address_t)tcg_ctx->code_gen_buffer;
+ buf_rw = region.start_aligned;
buf_rx = 0;
ret = mach_vm_remap(mach_task_self(),
&buf_rx,
@@ -841,11 +840,8 @@ static bool alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
*/
void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
{
- void *buf, *aligned, *end;
- size_t total_size;
size_t page_size;
size_t region_size;
- size_t n_regions;
size_t i;
uintptr_t splitwx_diff;
bool ok;
@@ -854,39 +850,33 @@ void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
splitwx, &error_fatal);
assert(ok);
- buf = tcg_init_ctx.code_gen_buffer;
- total_size = tcg_init_ctx.code_gen_buffer_size;
- page_size = qemu_real_host_page_size;
- n_regions = tcg_n_regions(total_size, max_cpus);
-
- /* The first region will be 'aligned - buf' bytes larger than the others */
- aligned = QEMU_ALIGN_PTR_UP(buf, page_size);
- g_assert(aligned < tcg_init_ctx.code_gen_buffer + total_size);
-
/*
* Make region_size a multiple of page_size, using aligned as the start.
* As a result of this we might end up with a few extra pages at the end of
* the buffer; we will assign those to the last region.
*/
- region_size = (total_size - (aligned - buf)) / n_regions;
+ region.n = tcg_n_regions(region.total_size, max_cpus);
+ page_size = qemu_real_host_page_size;
+ region_size = region.total_size / region.n;
region_size = QEMU_ALIGN_DOWN(region_size, page_size);
/* A region must have at least 2 pages; one code, one guard */
g_assert(region_size >= 2 * page_size);
+ region.stride = region_size;
+
+ /* Reserve space for guard pages. */
+ region.size = region_size - page_size;
+ region.total_size -= page_size;
+
+ /*
+ * The first region will be smaller than the others, via the prologue,
+ * which has yet to be allocated. For now, the first region begins at
+ * the page boundary.
+ */
+ region.after_prologue = region.start_aligned;
/* init the region struct */
qemu_mutex_init(&region.lock);
- region.n = n_regions;
- region.size = region_size - page_size;
- region.stride = region_size;
- region.after_prologue = buf;
- region.start_aligned = aligned;
- /* page-align the end, since its last page will be a guard page */
- end = QEMU_ALIGN_PTR_DOWN(buf + total_size, page_size);
- /* account for that last guard page */
- end -= page_size;
- total_size = end - aligned;
- region.total_size = total_size;
/* set guard pages */
splitwx_diff = tcg_splitwx_diff;
--
2.25.1
* [PATCH v2 22/29] tcg: Return the map protection from alloc_code_gen_buffer
2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
` (20 preceding siblings ...)
2021-03-14 21:27 ` [PATCH v2 21/29] tcg: Allocate code_gen_buffer into struct tcg_region_state Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
2021-03-14 22:04 ` Philippe Mathieu-Daudé
2021-03-14 21:27 ` [PATCH v2 23/29] tcg: Sink qemu_madvise call to common code Richard Henderson
` (8 subsequent siblings)
30 siblings, 1 reply; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
To: qemu-devel; +Cc: r.bolshakov, j
Change the interface from a boolean error indication to a
negative error vs a non-negative protection. For the moment
this is only an interface change, not making use of the new data.
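The convention can be sketched in isolation as follows. This is only an
illustration: alloc_buffer and alloc_ok are hypothetical stand-ins, not the
QEMU functions touched by this patch, and the sketch assumes a POSIX host
where PROT_NONE is 0 (as it is on Linux and macOS).

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical allocator following the convention described above:
 * return -1 on failure, otherwise the PROT_* bits the mapping was
 * created with.
 */
static int alloc_buffer(size_t size, int prot, void **out)
{
    void *buf;

    assert(out != NULL);
    buf = mmap(NULL, size, prot, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED) {
        return -1;
    }
    *out = buf;
    return prot;
}

/*
 * Success must be tested with ">= 0". A truthiness test would be wrong
 * in both directions: -1 is nonzero, and PROT_NONE (0 in this sketch's
 * assumption) is a valid success value.
 */
static int alloc_ok(int have_prot)
{
    return have_prot >= 0;
}
```

A caller would write something like "prot = alloc_buffer(...); if (prot < 0)
fail;", which is the shape of the have_prot check in tcg_region_init.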
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/region.c | 63 +++++++++++++++++++++++++++-------------------------
1 file changed, 33 insertions(+), 30 deletions(-)
diff --git a/tcg/region.c b/tcg/region.c
index 9a2b014838..3ca0d01fa4 100644
--- a/tcg/region.c
+++ b/tcg/region.c
@@ -521,14 +521,14 @@ static inline void split_cross_256mb(void **obuf, size_t *osize,
static uint8_t static_code_gen_buffer[DEFAULT_CODE_GEN_BUFFER_SIZE]
__attribute__((aligned(CODE_GEN_ALIGN)));
-static bool alloc_code_gen_buffer(size_t tb_size, int splitwx, Error **errp)
+static int alloc_code_gen_buffer(size_t tb_size, int splitwx, Error **errp)
{
void *buf, *end;
size_t size;
if (splitwx > 0) {
error_setg(errp, "jit split-wx not supported");
- return false;
+ return -1;
}
/* page-align the beginning and end of the buffer */
@@ -558,16 +558,17 @@ static bool alloc_code_gen_buffer(size_t tb_size, int splitwx, Error **errp)
region.start_aligned = buf;
region.total_size = size;
- return true;
+
+ return PROT_READ | PROT_WRITE;
}
#elif defined(_WIN32)
-static bool alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
+static int alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
{
void *buf;
if (splitwx > 0) {
error_setg(errp, "jit split-wx not supported");
- return false;
+ return -1;
}
buf = VirtualAlloc(NULL, size, MEM_RESERVE | MEM_COMMIT,
@@ -580,11 +581,12 @@ static bool alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
region.start_aligned = buf;
region.total_size = size;
- return true;
+
+ return PAGE_READ | PAGE_WRITE | PAGE_EXEC;
}
#else
-static bool alloc_code_gen_buffer_anon(size_t size, int prot,
- int flags, Error **errp)
+static int alloc_code_gen_buffer_anon(size_t size, int prot,
+ int flags, Error **errp)
{
void *buf;
@@ -592,7 +594,7 @@ static bool alloc_code_gen_buffer_anon(size_t size, int prot,
if (buf == MAP_FAILED) {
error_setg_errno(errp, errno,
"allocate %zu bytes for jit buffer", size);
- return false;
+ return -1;
}
#ifdef __mips__
@@ -633,7 +635,7 @@ static bool alloc_code_gen_buffer_anon(size_t size, int prot,
region.start_aligned = buf;
region.total_size = size;
- return true;
+ return prot;
}
#ifndef CONFIG_TCG_INTERPRETER
@@ -647,9 +649,9 @@ static bool alloc_code_gen_buffer_splitwx_memfd(size_t size, Error **errp)
#ifdef __mips__
/* Find space for the RX mapping, vs the 256MiB regions. */
- if (!alloc_code_gen_buffer_anon(size, PROT_NONE,
- MAP_PRIVATE | MAP_ANONYMOUS |
- MAP_NORESERVE, errp)) {
+ if (alloc_code_gen_buffer_anon(size, PROT_NONE,
+ MAP_PRIVATE | MAP_ANONYMOUS |
+ MAP_NORESERVE, errp) < 0) {
return false;
}
/* The size of the mapping may have been adjusted. */
@@ -683,7 +685,7 @@ static bool alloc_code_gen_buffer_splitwx_memfd(size_t size, Error **errp)
/* Request large pages for the buffer and the splitwx. */
qemu_madvise(buf_rw, size, QEMU_MADV_HUGEPAGE);
qemu_madvise(buf_rx, size, QEMU_MADV_HUGEPAGE);
- return true;
+ return PROT_READ | PROT_WRITE;
fail_rx:
error_setg_errno(errp, errno, "failed to map shared memory for execute");
@@ -697,7 +699,7 @@ static bool alloc_code_gen_buffer_splitwx_memfd(size_t size, Error **errp)
if (fd >= 0) {
close(fd);
}
- return false;
+ return -1;
}
#endif /* CONFIG_POSIX */
@@ -716,7 +718,7 @@ extern kern_return_t mach_vm_remap(vm_map_t target_task,
vm_prot_t *max_protection,
vm_inherit_t inheritance);
-static bool alloc_code_gen_buffer_splitwx_vmremap(size_t size, Error **errp)
+static int alloc_code_gen_buffer_splitwx_vmremap(size_t size, Error **errp)
{
kern_return_t ret;
mach_vm_address_t buf_rw, buf_rx;
@@ -725,7 +727,7 @@ static bool alloc_code_gen_buffer_splitwx_vmremap(size_t size, Error **errp)
/* Map the read-write portion via normal anon memory. */
if (!alloc_code_gen_buffer_anon(size, PROT_READ | PROT_WRITE,
MAP_PRIVATE | MAP_ANONYMOUS, errp)) {
- return false;
+ return -1;
}
buf_rw = region.start_aligned;
@@ -745,23 +747,23 @@ static bool alloc_code_gen_buffer_splitwx_vmremap(size_t size, Error **errp)
/* TODO: Convert "ret" to a human readable error message. */
error_setg(errp, "vm_remap for jit splitwx failed");
munmap((void *)buf_rw, size);
- return false;
+ return -1;
}
if (mprotect((void *)buf_rx, size, PROT_READ | PROT_EXEC) != 0) {
error_setg_errno(errp, errno, "mprotect for jit splitwx");
munmap((void *)buf_rx, size);
munmap((void *)buf_rw, size);
- return false;
+ return -1;
}
tcg_splitwx_diff = buf_rx - buf_rw;
- return true;
+ return PROT_READ | PROT_WRITE;
}
#endif /* CONFIG_DARWIN */
#endif /* CONFIG_TCG_INTERPRETER */
-static bool alloc_code_gen_buffer_splitwx(size_t size, Error **errp)
+static int alloc_code_gen_buffer_splitwx(size_t size, Error **errp)
{
#ifndef CONFIG_TCG_INTERPRETER
# ifdef CONFIG_DARWIN
@@ -772,24 +774,25 @@ static bool alloc_code_gen_buffer_splitwx(size_t size, Error **errp)
# endif
#endif
error_setg(errp, "jit split-wx not supported");
- return false;
+ return -1;
}
-static bool alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
+static int alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
{
ERRP_GUARD();
int prot, flags;
if (splitwx) {
- if (alloc_code_gen_buffer_splitwx(size, errp)) {
- return true;
+ prot = alloc_code_gen_buffer_splitwx(size, errp);
+ if (prot >= 0) {
+ return prot;
}
/*
* If splitwx force-on (1), fail;
* if splitwx default-on (-1), fall through to splitwx off.
*/
if (splitwx > 0) {
- return false;
+ return -1;
}
error_free_or_abort(errp);
}
@@ -844,11 +847,11 @@ void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
size_t region_size;
size_t i;
uintptr_t splitwx_diff;
- bool ok;
+ int have_prot;
- ok = alloc_code_gen_buffer(size_code_gen_buffer(tb_size),
- splitwx, &error_fatal);
- assert(ok);
+ have_prot = alloc_code_gen_buffer(size_code_gen_buffer(tb_size),
+ splitwx, &error_fatal);
+ assert(have_prot >= 0);
/*
* Make region_size a multiple of page_size, using aligned as the start.
--
2.25.1
* [PATCH v2 23/29] tcg: Sink qemu_madvise call to common code
2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
` (21 preceding siblings ...)
2021-03-14 21:27 ` [PATCH v2 22/29] tcg: Return the map protection from alloc_code_gen_buffer Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
2021-03-14 21:27 ` [PATCH v2 24/29] tcg: Do not set guard pages in the rx buffer Richard Henderson
` (7 subsequent siblings)
30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
To: qemu-devel; +Cc: r.bolshakov, j
Move the call out of the N versions of alloc_code_gen_buffer
and into tcg_region_init.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/region.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/tcg/region.c b/tcg/region.c
index 3ca0d01fa4..994c083343 100644
--- a/tcg/region.c
+++ b/tcg/region.c
@@ -554,7 +554,6 @@ static int alloc_code_gen_buffer(size_t tb_size, int splitwx, Error **errp)
error_setg_errno(errp, errno, "mprotect of jit buffer");
return false;
}
- qemu_madvise(buf, size, QEMU_MADV_HUGEPAGE);
region.start_aligned = buf;
region.total_size = size;
@@ -630,9 +629,6 @@ static int alloc_code_gen_buffer_anon(size_t size, int prot,
}
#endif
- /* Request large pages for the buffer. */
- qemu_madvise(buf, size, QEMU_MADV_HUGEPAGE);
-
region.start_aligned = buf;
region.total_size = size;
return prot;
@@ -682,9 +678,6 @@ static bool alloc_code_gen_buffer_splitwx_memfd(size_t size, Error **errp)
region.total_size = size;
tcg_splitwx_diff = buf_rx - buf_rw;
- /* Request large pages for the buffer and the splitwx. */
- qemu_madvise(buf_rw, size, QEMU_MADV_HUGEPAGE);
- qemu_madvise(buf_rx, size, QEMU_MADV_HUGEPAGE);
return PROT_READ | PROT_WRITE;
fail_rx:
@@ -853,6 +846,13 @@ void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
splitwx, &error_fatal);
assert(have_prot >= 0);
+ /* Request large pages for the buffer and the splitwx. */
+ qemu_madvise(region.start_aligned, region.total_size, QEMU_MADV_HUGEPAGE);
+ if (tcg_splitwx_diff) {
+ qemu_madvise(region.start_aligned + tcg_splitwx_diff,
+ region.total_size, QEMU_MADV_HUGEPAGE);
+ }
+
/*
* Make region_size a multiple of page_size, using aligned as the start.
* As a result of this we might end up with a few extra pages at the end of
--
2.25.1
* [PATCH v2 24/29] tcg: Do not set guard pages in the rx buffer
2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
` (22 preceding siblings ...)
2021-03-14 21:27 ` [PATCH v2 23/29] tcg: Sink qemu_madvise call to common code Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
2021-03-14 21:27 ` [PATCH v2 25/29] util/osdep: Add qemu_mprotect_rw Richard Henderson
` (6 subsequent siblings)
30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
To: qemu-devel; +Cc: r.bolshakov, j
We only need guard pages in the rw buffer to avoid buffer overruns.
Let the rx buffer keep large pages all the way through.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/region.c | 8 +-------
1 file changed, 1 insertion(+), 7 deletions(-)
diff --git a/tcg/region.c b/tcg/region.c
index 994c083343..27a7e35c8e 100644
--- a/tcg/region.c
+++ b/tcg/region.c
@@ -839,7 +839,6 @@ void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
size_t page_size;
size_t region_size;
size_t i;
- uintptr_t splitwx_diff;
int have_prot;
have_prot = alloc_code_gen_buffer(size_code_gen_buffer(tb_size),
@@ -881,8 +880,7 @@ void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
/* init the region struct */
qemu_mutex_init(&region.lock);
- /* set guard pages */
- splitwx_diff = tcg_splitwx_diff;
+ /* Set guard pages. No need to do this for the rx_buf, only the rw_buf. */
for (i = 0; i < region.n; i++) {
void *start, *end;
int rc;
@@ -890,10 +888,6 @@ void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
tcg_region_bounds(i, &start, &end);
rc = qemu_mprotect_none(end, page_size);
g_assert(!rc);
- if (splitwx_diff) {
- rc = qemu_mprotect_none(end + splitwx_diff, page_size);
- g_assert(!rc);
- }
}
tcg_region_trees_init();
--
2.25.1
* [PATCH v2 25/29] util/osdep: Add qemu_mprotect_rw
2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
` (23 preceding siblings ...)
2021-03-14 21:27 ` [PATCH v2 24/29] tcg: Do not set guard pages in the rx buffer Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
2021-03-14 21:27 ` [PATCH v2 26/29] tcg: Round the tb_size default from qemu_get_host_physmem Richard Henderson
` (5 subsequent siblings)
30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
To: qemu-devel; +Cc: r.bolshakov, j, Philippe Mathieu-Daudé
For --enable-tcg-interpreter on Windows, we will need this.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
include/qemu/osdep.h | 1 +
util/osdep.c | 9 +++++++++
2 files changed, 10 insertions(+)
diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
index ba15be9c56..5cc2e57bdf 100644
--- a/include/qemu/osdep.h
+++ b/include/qemu/osdep.h
@@ -494,6 +494,7 @@ void sigaction_invoke(struct sigaction *action,
#endif
int qemu_madvise(void *addr, size_t len, int advice);
+int qemu_mprotect_rw(void *addr, size_t size);
int qemu_mprotect_rwx(void *addr, size_t size);
int qemu_mprotect_none(void *addr, size_t size);
diff --git a/util/osdep.c b/util/osdep.c
index 66d01b9160..42a0a4986a 100644
--- a/util/osdep.c
+++ b/util/osdep.c
@@ -97,6 +97,15 @@ static int qemu_mprotect__osdep(void *addr, size_t size, int prot)
#endif
}
+int qemu_mprotect_rw(void *addr, size_t size)
+{
+#ifdef _WIN32
+ return qemu_mprotect__osdep(addr, size, PAGE_READWRITE);
+#else
+ return qemu_mprotect__osdep(addr, size, PROT_READ | PROT_WRITE);
+#endif
+}
+
int qemu_mprotect_rwx(void *addr, size_t size)
{
#ifdef _WIN32
--
2.25.1
* [PATCH v2 26/29] tcg: Round the tb_size default from qemu_get_host_physmem
2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
` (24 preceding siblings ...)
2021-03-14 21:27 ` [PATCH v2 25/29] util/osdep: Add qemu_mprotect_rw Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
2021-03-14 21:27 ` [PATCH v2 27/29] tcg: Merge buffer protection and guard page protection Richard Henderson
` (4 subsequent siblings)
30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
To: qemu-devel; +Cc: r.bolshakov, j
If qemu_get_host_physmem returns a size that is not a multiple of
8 pages (for example, an odd number of pages), then physmem / 8
will not be a multiple of the page size.
The following was observed on a gitlab runner:
ERROR qtest-arm/boot-serial-test - Bail out!
ERROR:../util/osdep.c:80:qemu_mprotect__osdep: \
assertion failed: (!(size & ~qemu_real_host_page_mask))
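The arithmetic behind the failure can be reproduced in a minimal standalone
sketch. ALIGN_DOWN is a local stand-in for QEMU_ALIGN_DOWN, default_tb_size
and misaligned are hypothetical helpers, and the page size and page count used
with them are illustrative assumptions, not values taken from the failing
runner.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in for QEMU_ALIGN_DOWN; assumes m is nonzero. */
#define ALIGN_DOWN(n, m) ((n) / (m) * (m))

/* Hypothetical helper: default buffer size derived from host memory. */
static size_t default_tb_size(size_t phys_mem, size_t page_size)
{
    return ALIGN_DOWN(phys_mem / 8, page_size);
}

/* Nonzero if size is not a whole number of pages. */
static int misaligned(size_t size, size_t page_size)
{
    /* The mask trick below requires a power-of-two page size. */
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return (size & (page_size - 1)) != 0;
}
```

With 4096-byte pages and 1000001 pages of host memory, phys_mem / 8 leaves a
remainder of 512 bytes, so misaligned() reports it, while the value from
default_tb_size() is a whole number of pages.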
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/region.c | 47 +++++++++++++++++++++--------------------------
1 file changed, 21 insertions(+), 26 deletions(-)
diff --git a/tcg/region.c b/tcg/region.c
index 27a7e35c8e..4dc1237ff4 100644
--- a/tcg/region.c
+++ b/tcg/region.c
@@ -469,26 +469,6 @@ static size_t tcg_n_regions(size_t tb_size, unsigned max_cpus)
(DEFAULT_CODE_GEN_BUFFER_SIZE_1 < MAX_CODE_GEN_BUFFER_SIZE \
? DEFAULT_CODE_GEN_BUFFER_SIZE_1 : MAX_CODE_GEN_BUFFER_SIZE)
-static size_t size_code_gen_buffer(size_t tb_size)
-{
- /* Size the buffer. */
- if (tb_size == 0) {
- size_t phys_mem = qemu_get_host_physmem();
- if (phys_mem == 0) {
- tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE;
- } else {
- tb_size = MIN(DEFAULT_CODE_GEN_BUFFER_SIZE, phys_mem / 8);
- }
- }
- if (tb_size < MIN_CODE_GEN_BUFFER_SIZE) {
- tb_size = MIN_CODE_GEN_BUFFER_SIZE;
- }
- if (tb_size > MAX_CODE_GEN_BUFFER_SIZE) {
- tb_size = MAX_CODE_GEN_BUFFER_SIZE;
- }
- return tb_size;
-}
-
#ifdef __mips__
/* In order to use J and JAL within the code_gen_buffer, we require
that the buffer not cross a 256MB boundary. */
@@ -836,13 +816,29 @@ static int alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
*/
void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
{
- size_t page_size;
+ const size_t page_size = qemu_real_host_page_size;
size_t region_size;
size_t i;
int have_prot;
- have_prot = alloc_code_gen_buffer(size_code_gen_buffer(tb_size),
- splitwx, &error_fatal);
+ /* Size the buffer. */
+ if (tb_size == 0) {
+ size_t phys_mem = qemu_get_host_physmem();
+ if (phys_mem == 0) {
+ tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE;
+ } else {
+ tb_size = QEMU_ALIGN_DOWN(phys_mem / 8, page_size);
+ tb_size = MIN(DEFAULT_CODE_GEN_BUFFER_SIZE, tb_size);
+ }
+ }
+ if (tb_size < MIN_CODE_GEN_BUFFER_SIZE) {
+ tb_size = MIN_CODE_GEN_BUFFER_SIZE;
+ }
+ if (tb_size > MAX_CODE_GEN_BUFFER_SIZE) {
+ tb_size = MAX_CODE_GEN_BUFFER_SIZE;
+ }
+
+ have_prot = alloc_code_gen_buffer(tb_size, splitwx, &error_fatal);
assert(have_prot >= 0);
/* Request large pages for the buffer and the splitwx. */
@@ -857,9 +853,8 @@ void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
* As a result of this we might end up with a few extra pages at the end of
* the buffer; we will assign those to the last region.
*/
- region.n = tcg_n_regions(region.total_size, max_cpus);
- page_size = qemu_real_host_page_size;
- region_size = region.total_size / region.n;
+ region.n = tcg_n_regions(tb_size, max_cpus);
+ region_size = tb_size / region.n;
region_size = QEMU_ALIGN_DOWN(region_size, page_size);
/* A region must have at least 2 pages; one code, one guard */
--
2.25.1
* [PATCH v2 27/29] tcg: Merge buffer protection and guard page protection
2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
` (25 preceding siblings ...)
2021-03-14 21:27 ` [PATCH v2 26/29] tcg: Round the tb_size default from qemu_get_host_physmem Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
2021-03-14 21:27 ` [PATCH v2 28/29] tcg: When allocating for !splitwx, begin with PROT_NONE Richard Henderson
` (3 subsequent siblings)
30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
To: qemu-devel; +Cc: r.bolshakov, j
Do not handle protections on a case-by-case basis in the
various alloc_code_gen_buffer instances; do it within a
single loop in tcg_region_init.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/region.c | 40 +++++++++++++++++++++++++++++-----------
1 file changed, 29 insertions(+), 11 deletions(-)
diff --git a/tcg/region.c b/tcg/region.c
index 4dc1237ff4..fac416ebf5 100644
--- a/tcg/region.c
+++ b/tcg/region.c
@@ -530,11 +530,6 @@ static int alloc_code_gen_buffer(size_t tb_size, int splitwx, Error **errp)
}
#endif
- if (qemu_mprotect_rwx(buf, size)) {
- error_setg_errno(errp, errno, "mprotect of jit buffer");
- return false;
- }
-
region.start_aligned = buf;
region.total_size = size;
@@ -818,8 +813,7 @@ void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
{
const size_t page_size = qemu_real_host_page_size;
size_t region_size;
- size_t i;
- int have_prot;
+ int have_prot, need_prot;
/* Size the buffer. */
if (tb_size == 0) {
@@ -875,14 +869,38 @@ void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
/* init the region struct */
qemu_mutex_init(&region.lock);
- /* Set guard pages. No need to do this for the rx_buf, only the rw_buf. */
- for (i = 0; i < region.n; i++) {
+ /*
+ * Set guard pages. No need to do this for the rx_buf, only the rw_buf.
+ * Work with the page protections set up with the initial mapping.
+ */
+ need_prot = PAGE_READ | PAGE_WRITE;
+#ifndef CONFIG_TCG_INTERPRETER
+ if (tcg_splitwx_diff == 0) {
+ need_prot |= PAGE_EXEC;
+ }
+#endif
+ for (size_t i = 0, n = region.n; i < n; i++) {
void *start, *end;
int rc;
tcg_region_bounds(i, &start, &end);
- rc = qemu_mprotect_none(end, page_size);
- g_assert(!rc);
+ if (have_prot != need_prot) {
+ if (need_prot == (PAGE_READ | PAGE_WRITE | PAGE_EXEC)) {
+ rc = qemu_mprotect_rwx(start, end - start);
+ } else if (need_prot == (PAGE_READ | PAGE_WRITE)) {
+ rc = qemu_mprotect_rw(start, end - start);
+ } else {
+ g_assert_not_reached();
+ }
+ if (rc) {
+ error_setg_errno(&error_fatal, errno,
+ "mprotect of jit buffer");
+ }
+ }
+ if (have_prot != 0) {
+ /* If guard-page permissions don't change, it isn't fatal. */
+ (void)qemu_mprotect_none(end, page_size);
+ }
}
tcg_region_trees_init();
--
2.25.1
* [PATCH v2 28/29] tcg: When allocating for !splitwx, begin with PROT_NONE
2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
` (26 preceding siblings ...)
2021-03-14 21:27 ` [PATCH v2 27/29] tcg: Merge buffer protection and guard page protection Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
2021-03-14 21:27 ` [PATCH v2 29/29] tcg: Move tcg_init_ctx and tcg_ctx from accel/tcg/ Richard Henderson
` (2 subsequent siblings)
30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
To: qemu-devel; +Cc: r.bolshakov, j
There's a change in mprotect() behaviour [1] in the latest macOS
on M1 and it's not yet clear if it's going to be fixed by Apple.
In this case, instead of changing permissions of N guard pages,
we change permissions of N rwx regions. The same number of
syscalls is required either way.
[1] https://gist.github.com/hikalium/75ae822466ee4da13cbbe486498a191f
Buglink: https://bugs.launchpad.net/qemu/+bug/1914849
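The ordering described above can be sketched with plain POSIX calls. This is
an illustration only: reserve_regions, its parameters, and the region layout
are hypothetical, and the sketch grants read-write rather than rwx so that it
also runs on hosts that refuse writable-executable mappings.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/*
 * Hypothetical sketch: map n regions of `stride` bytes with PROT_NONE,
 * then open up all but the last page of each region. The untouched
 * last page of each region stays PROT_NONE and acts as a guard page.
 * Returns the base address, or NULL on failure.
 */
static void *reserve_regions(size_t n, size_t stride, size_t page_size)
{
    size_t total = n * stride;
    uint8_t *base;

    /* Each region needs at least one usable page plus one guard page. */
    assert(stride >= 2 * page_size && stride % page_size == 0);

    base = mmap(NULL, total, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) {
        return NULL;
    }
    for (size_t i = 0; i < n; i++) {
        /* One mprotect per region: the same count as one per guard page. */
        if (mprotect(base + i * stride, stride - page_size,
                     PROT_READ | PROT_WRITE) != 0) {
            munmap(base, total);
            return NULL;
        }
    }
    return base;
}
```

No mapping ever goes from accessible to PROT_NONE in this scheme, which is the
transition the commit message reports as failing on macOS 11.2.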
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
tcg/region.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/tcg/region.c b/tcg/region.c
index fac416ebf5..53f78965c7 100644
--- a/tcg/region.c
+++ b/tcg/region.c
@@ -765,12 +765,15 @@ static int alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
error_free_or_abort(errp);
}
- prot = PROT_READ | PROT_WRITE | PROT_EXEC;
+ /*
+ * macOS 11.2 has a bug (Apple Feedback FB8994773) in which mprotect
+ * rejects a permission change from RWX -> NONE when reserving the
+ * guard pages later. We can go the other way with the same number
+ * of syscalls, so always begin with PROT_NONE.
+ */
+ prot = PROT_NONE;
flags = MAP_PRIVATE | MAP_ANONYMOUS;
-#ifdef CONFIG_TCG_INTERPRETER
- /* The tcg interpreter does not need execute permission. */
- prot = PROT_READ | PROT_WRITE;
-#elif defined(CONFIG_DARWIN)
+#ifdef CONFIG_DARWIN
/* Applicable to both iOS and macOS (Apple Silicon). */
if (!splitwx) {
flags |= MAP_JIT;
--
2.25.1
* [PATCH v2 29/29] tcg: Move tcg_init_ctx and tcg_ctx from accel/tcg/
2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
` (27 preceding siblings ...)
2021-03-14 21:27 ` [PATCH v2 28/29] tcg: When allocating for !splitwx, begin with PROT_NONE Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
2021-03-14 22:00 ` Philippe Mathieu-Daudé
2021-03-14 22:12 ` [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug no-reply
2021-03-15 23:08 ` Roman Bolshakov
30 siblings, 1 reply; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
To: qemu-devel; +Cc: r.bolshakov, j, Philippe Mathieu-Daudé
These variables belong to the jit side, not the user side.
Since tcg_init_ctx is no longer used outside of tcg/, move
the declaration to tcg/internal.h.
Suggested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
include/tcg/tcg.h | 1 -
tcg/internal.h | 1 +
accel/tcg/translate-all.c | 3 ---
tcg/tcg.c | 3 +++
4 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index a19deb529f..eef8857cca 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -690,7 +690,6 @@ static inline bool temp_readonly(TCGTemp *ts)
return ts->kind >= TEMP_FIXED;
}
-extern TCGContext tcg_init_ctx;
extern __thread TCGContext *tcg_ctx;
extern const void *tcg_code_gen_epilogue;
extern uintptr_t tcg_splitwx_diff;
diff --git a/tcg/internal.h b/tcg/internal.h
index f9906523da..181f86507a 100644
--- a/tcg/internal.h
+++ b/tcg/internal.h
@@ -27,6 +27,7 @@
#define TCG_HIGHWATER 1024
+extern TCGContext tcg_init_ctx;
extern TCGContext **tcg_ctxs;
extern unsigned int tcg_cur_ctxs;
extern unsigned int tcg_max_ctxs;
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 40aeecf611..b32760c253 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -218,9 +218,6 @@ static int v_l2_levels;
static void *l1_map[V_L1_MAX_SIZE];
-/* code generation context */
-TCGContext tcg_init_ctx;
-__thread TCGContext *tcg_ctx;
TBContext tb_ctx;
static void page_table_config_init(void)
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 65f9cf01d5..77335fb60f 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -149,6 +149,9 @@ static int tcg_target_const_match(tcg_target_long val, TCGType type,
static int tcg_out_ldst_finalize(TCGContext *s);
#endif
+TCGContext tcg_init_ctx;
+__thread TCGContext *tcg_ctx;
+
TCGContext **tcg_ctxs;
unsigned int tcg_cur_ctxs;
unsigned int tcg_max_ctxs;
--
2.25.1
* Re: [PATCH v2 29/29] tcg: Move tcg_init_ctx and tcg_ctx from accel/tcg/
2021-03-14 21:27 ` [PATCH v2 29/29] tcg: Move tcg_init_ctx and tcg_ctx from accel/tcg/ Richard Henderson
@ 2021-03-14 22:00 ` Philippe Mathieu-Daudé
0 siblings, 0 replies; 38+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-03-14 22:00 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: r.bolshakov, j
On 3/14/21 10:27 PM, Richard Henderson wrote:
> These variables belong to the jit side, not the user side.
>
> Since tcg_init_ctx is no longer used outside of tcg/, move
> the declaration to tcg/internal.h.
>
> Suggested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> include/tcg/tcg.h | 1 -
> tcg/internal.h | 1 +
> accel/tcg/translate-all.c | 3 ---
> tcg/tcg.c | 3 +++
> 4 files changed, 4 insertions(+), 4 deletions(-)
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
* Re: [PATCH v2 22/29] tcg: Return the map protection from alloc_code_gen_buffer
2021-03-14 21:27 ` [PATCH v2 22/29] tcg: Return the map protection from alloc_code_gen_buffer Richard Henderson
@ 2021-03-14 22:04 ` Philippe Mathieu-Daudé
0 siblings, 0 replies; 38+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-03-14 22:04 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: r.bolshakov, j
On 3/14/21 10:27 PM, Richard Henderson wrote:
> Change the interface from a boolean error indication to a
> negative error vs a non-negative protection. For the moment
> this is only an interface change, not making use of the new data.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> tcg/region.c | 63 +++++++++++++++++++++++++++-------------------------
> 1 file changed, 33 insertions(+), 30 deletions(-)
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
* Re: [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug
2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
` (28 preceding siblings ...)
2021-03-14 21:27 ` [PATCH v2 29/29] tcg: Move tcg_init_ctx and tcg_ctx from accel/tcg/ Richard Henderson
@ 2021-03-14 22:12 ` no-reply
2021-03-15 23:08 ` Roman Bolshakov
30 siblings, 0 replies; 38+ messages in thread
From: no-reply @ 2021-03-14 22:12 UTC (permalink / raw)
To: richard.henderson; +Cc: r.bolshakov, qemu-devel, j
Patchew URL: https://patchew.org/QEMU/20210314212724.1917075-1-richard.henderson@linaro.org/
Hi,
This series seems to have some coding style problems. See output below for
more information:
Type: series
Message-id: 20210314212724.1917075-1-richard.henderson@linaro.org
Subject: [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug
=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===
Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
- [tag update] patchew/20210314163927.1184-1-peter.maydell@linaro.org -> patchew/20210314163927.1184-1-peter.maydell@linaro.org
* [new tag] patchew/20210314212724.1917075-1-richard.henderson@linaro.org -> patchew/20210314212724.1917075-1-richard.henderson@linaro.org
Switched to a new branch 'test'
9906c07 tcg: Move tcg_init_ctx and tcg_ctx from accel/tcg/
ef1e2c0 tcg: When allocating for !splitwx, begin with PROT_NONE
76e12ad tcg: Merge buffer protection and guard page protection
1e9c899 tcg: Round the tb_size default from qemu_get_host_physmem
4bdfded util/osdep: Add qemu_mprotect_rw
a1751c5 tcg: Do not set guard pages in the rx buffer
40483ad tcg: Sink qemu_madvise call to common code
856c724 tcg: Return the map protection from alloc_code_gen_buffer
7622097 tcg: Allocate code_gen_buffer into struct tcg_region_state
251d71e tcg: Move in_code_gen_buffer and tests to region.c
a6a064d tcg: Tidy split_cross_256mb
af03a0d tcg: Tidy tcg_n_regions
218436d tcg: Rename region.start to region.after_prologue
9f3981e tcg: Replace region.end with region.total_size
276ecb9 tcg: Move MAX_CODE_GEN_BUFFER_SIZE to tcg-target.h
683f5af tcg: Introduce tcg_max_ctxs
d7bf2f6 accel/tcg: Pass down max_cpus to tcg_init
a1cd412 accel/tcg: Merge tcg_exec_init into tcg_init_machine
4940162 tcg: Create tcg_init
4ab59ad accel/tcg: Rename tcg_init to tcg_init_machine
e27bd38 accel/tcg: Move alloc_code_gen_buffer to tcg/region.c
d4c3608 accel/tcg: Inline cpu_gen_init
2245d5c tcg: Split out region.c
a284234 tcg: Split out tcg_region_prologue_set
d116828 tcg: Split out tcg_region_initial_alloc
c75ce79 tcg: Remove error return from tcg_region_initial_alloc__locked
0df4d6c tcg: Re-order tcg_region_init vs tcg_prologue_init
cc0f7f7 meson: Split out fpu/meson.build
b0a2113 meson: Split out tcg/meson.build
=== OUTPUT BEGIN ===
1/29 Checking commit b0a211318ba3 (meson: Split out tcg/meson.build)
Use of uninitialized value $acpi_testexpected in string eq at ./scripts/checkpatch.pl line 1529.
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#44:
new file mode 100644
total: 0 errors, 1 warnings, 35 lines checked
Patch 1/29 has style problems, please review. If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
2/29 Checking commit cc0f7f7fdc1a (meson: Split out fpu/meson.build)
Use of uninitialized value $acpi_testexpected in string eq at ./scripts/checkpatch.pl line 1529.
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#16:
new file mode 100644
total: 0 errors, 1 warnings, 17 lines checked
Patch 2/29 has style problems, please review. If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
3/29 Checking commit 0df4d6c7ee67 (tcg: Re-order tcg_region_init vs tcg_prologue_init)
4/29 Checking commit c75ce79d8926 (tcg: Remove error return from tcg_region_initial_alloc__locked)
5/29 Checking commit d116828491cc (tcg: Split out tcg_region_initial_alloc)
6/29 Checking commit a284234d3909 (tcg: Split out tcg_region_prologue_set)
7/29 Checking commit 2245d5c83ec4 (tcg: Split out region.c)
Use of uninitialized value $acpi_testexpected in string eq at ./scripts/checkpatch.pl line 1529.
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#17:
new file mode 100644
total: 0 errors, 1 warnings, 1189 lines checked
Patch 7/29 has style problems, please review. If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
8/29 Checking commit d4c36080e021 (accel/tcg: Inline cpu_gen_init)
9/29 Checking commit e27bd38f652c (accel/tcg: Move alloc_code_gen_buffer to tcg/region.c)
WARNING: Block comments use a leading /* on a separate line
#499: FILE: tcg/region.c:411:
+/* Minimum size of the code gen buffer. This number is randomly chosen,
WARNING: Block comments use * on subsequent lines
#500: FILE: tcg/region.c:412:
+/* Minimum size of the code gen buffer. This number is randomly chosen,
+ but not so small that we can't have a fair number of TB's live. */
WARNING: Block comments use a trailing */ on a separate line
#500: FILE: tcg/region.c:412:
+ but not so small that we can't have a fair number of TB's live. */
WARNING: Block comments use a leading /* on a separate line
#503: FILE: tcg/region.c:415:
+/* Maximum size of the code gen buffer we'd like to use. Unless otherwise
WARNING: Block comments use * on subsequent lines
#504: FILE: tcg/region.c:416:
+/* Maximum size of the code gen buffer we'd like to use. Unless otherwise
+ indicated, this is constrained by the range of direct branches on the
WARNING: Block comments use a trailing */ on a separate line
#505: FILE: tcg/region.c:417:
+ host cpu, as used by the TCG implementation of goto_tb. */
WARNING: architecture specific defines should be avoided
#506: FILE: tcg/region.c:418:
+#if defined(__x86_64__)
WARNING: Block comments use a leading /* on a separate line
#520: FILE: tcg/region.c:432:
+ /* We have a 256MB branch region, but leave room to make sure the
WARNING: Block comments use * on subsequent lines
#521: FILE: tcg/region.c:433:
+ /* We have a 256MB branch region, but leave room to make sure the
+ main executable is also within that region. */
WARNING: Block comments use a trailing */ on a separate line
#521: FILE: tcg/region.c:433:
+ main executable is also within that region. */
WARNING: architecture specific defines should be avoided
#579: FILE: tcg/region.c:491:
+#ifdef __mips__
WARNING: Block comments use a leading /* on a separate line
#580: FILE: tcg/region.c:492:
+/* In order to use J and JAL within the code_gen_buffer, we require
WARNING: Block comments use * on subsequent lines
#581: FILE: tcg/region.c:493:
+/* In order to use J and JAL within the code_gen_buffer, we require
+ that the buffer not cross a 256MB boundary. */
WARNING: Block comments use a trailing */ on a separate line
#581: FILE: tcg/region.c:493:
+ that the buffer not cross a 256MB boundary. */
WARNING: Block comments use a leading /* on a separate line
#587: FILE: tcg/region.c:499:
+/* We weren't able to allocate a buffer without crossing that boundary,
WARNING: Block comments use * on subsequent lines
#588: FILE: tcg/region.c:500:
+/* We weren't able to allocate a buffer without crossing that boundary,
+ so make do with the larger portion of the buffer that doesn't cross.
WARNING: Block comments use a trailing */ on a separate line
#589: FILE: tcg/region.c:501:
+ Returns the new base of the buffer, and adjusts code_gen_buffer_size. */
WARNING: architecture specific defines should be avoided
#634: FILE: tcg/region.c:546:
+#ifdef __mips__
WARNING: architecture specific defines should be avoided
#686: FILE: tcg/region.c:598:
+#ifdef __mips__
WARNING: architecture specific defines should be avoided
#736: FILE: tcg/region.c:648:
+#ifdef __mips__
WARNING: architecture specific defines should be avoided
#753: FILE: tcg/region.c:665:
+#ifdef __mips__
ERROR: externs should be avoided in .c files
#795: FILE: tcg/region.c:707:
+extern kern_return_t mach_vm_remap(vm_map_t target_task,
total: 1 errors, 21 warnings, 895 lines checked
Patch 9/29 has style problems, please review. If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
10/29 Checking commit 4ab59ad7fbac (accel/tcg: Rename tcg_init to tcg_init_machine)
11/29 Checking commit 49401622f6bd (tcg: Create tcg_init)
12/29 Checking commit a1cd412ff253 (accel/tcg: Merge tcg_exec_init into tcg_init_machine)
WARNING: Block comments use a leading /* on a separate line
#56: FILE: accel/tcg/tcg-all.c:121:
+ /* There's no guest base to take into account, so go ahead and
WARNING: Block comments use * on subsequent lines
#57: FILE: accel/tcg/tcg-all.c:122:
+ /* There's no guest base to take into account, so go ahead and
+ initialize the prologue now. */
WARNING: Block comments use a trailing */ on a separate line
#57: FILE: accel/tcg/tcg-all.c:122:
+ initialize the prologue now. */
total: 0 errors, 3 warnings, 81 lines checked
Patch 12/29 has style problems, please review. If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
13/29 Checking commit d7bf2f6c5b84 (accel/tcg: Pass down max_cpus to tcg_init)
14/29 Checking commit 683f5af79dd7 (tcg: Introduce tcg_max_ctxs)
15/29 Checking commit 276ecb9d18cb (tcg: Move MAX_CODE_GEN_BUFFER_SIZE to tcg-target.h)
16/29 Checking commit 9f3981e89b60 (tcg: Replace region.end with region.total_size)
17/29 Checking commit 218436d137c1 (tcg: Rename region.start to region.after_prologue)
18/29 Checking commit af03a0d81294 (tcg: Tidy tcg_n_regions)
19/29 Checking commit a6a064d88d21 (tcg: Tidy split_cross_256mb)
20/29 Checking commit 251d71e63d7f (tcg: Move in_code_gen_buffer and tests to region.c)
21/29 Checking commit 76220971b8cc (tcg: Allocate code_gen_buffer into struct tcg_region_state)
22/29 Checking commit 856c72493829 (tcg: Return the map protection from alloc_code_gen_buffer)
23/29 Checking commit 40483adb7b2e (tcg: Sink qemu_madvise call to common code)
24/29 Checking commit a1751c559ba8 (tcg: Do not set guard pages in the rx buffer)
25/29 Checking commit 4bdfded6d21a (util/osdep: Add qemu_mprotect_rw)
26/29 Checking commit 1e9c89999f44 (tcg: Round the tb_size default from qemu_get_host_physmem)
27/29 Checking commit 76e12ad880b1 (tcg: Merge buffer protection and guard page protection)
28/29 Checking commit ef1e2c0e7aed (tcg: When allocating for !splitwx, begin with PROT_NONE)
29/29 Checking commit 9906c07d1a1e (tcg: Move tcg_init_ctx and tcg_ctx from accel/tcg/)
=== OUTPUT END ===
Test command exited with code: 1
The full log is available at
http://patchew.org/logs/20210314212724.1917075-1-richard.henderson@linaro.org/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com
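
For reference, the "Block comments" warnings reported for patch 9/29 all concern the same comment shape. As a minimal sketch, the layout checkpatch asks for puts the leading /* on its own line, starts each subsequent line with *, and puts the trailing */ on its own line. The 1 MiB value and the helper clamp_to_min_buffer_size() are assumptions made for this illustration and are not taken from the series.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Minimum size of the code gen buffer.  This number is randomly chosen,
 * but not so small that we can't have a fair number of TB's live.
 *
 * NOTE: the 1 MiB value is an assumption made for this sketch.
 */
#define MIN_CODE_GEN_BUFFER_SIZE ((size_t)1 << 20)

/* Hypothetical helper: raise a requested size to the minimum. */
size_t clamp_to_min_buffer_size(size_t requested)
{
    size_t result = requested;

    if (result < MIN_CODE_GEN_BUFFER_SIZE) {
        result = MIN_CODE_GEN_BUFFER_SIZE;
    }
    /* Postcondition: the result never falls below the minimum. */
    assert(result >= MIN_CODE_GEN_BUFFER_SIZE);
    return result;
}
```

Only the comment layout is the point here; the helper exists so that the fragment compiles on its own.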
* Re: [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug
2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
` (29 preceding siblings ...)
2021-03-14 22:12 ` [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug no-reply
@ 2021-03-15 23:08 ` Roman Bolshakov
30 siblings, 0 replies; 38+ messages in thread
From: Roman Bolshakov @ 2021-03-15 23:08 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, j
On Sun, Mar 14, 2021 at 03:26:55PM -0600, Richard Henderson wrote:
> Changes for v2:
> * Move tcg_init_ctx someplace more private (patch 29)
> * Round result of tb_size based on qemu_get_host_physmem (patch 26)
>
> Blurb for v1:
> It took a few more patches than imagined to unify the two
> places in which we manipulate the tcg code_gen buffer, but
> the result is surely cleaner.
>
> There's a lot more that could be done to clean up this part
> of tcg too. I tried to not get too side-tracked, but didn't
> wholly succeed.
>
>
Hi Richard,
Thanks for doing the changes!
I'm not sure if I'll find enough time for a thorough review, but the series
helps qemu on Big Sur 11.2.3, so:
Tested-by: Roman Bolshakov <r.bolshakov@yadro.com>
Regards,
Roman
> r~
>
>
> Richard Henderson (29):
> meson: Split out tcg/meson.build
> meson: Split out fpu/meson.build
> tcg: Re-order tcg_region_init vs tcg_prologue_init
> tcg: Remove error return from tcg_region_initial_alloc__locked
> tcg: Split out tcg_region_initial_alloc
> tcg: Split out tcg_region_prologue_set
> tcg: Split out region.c
> accel/tcg: Inline cpu_gen_init
> accel/tcg: Move alloc_code_gen_buffer to tcg/region.c
> accel/tcg: Rename tcg_init to tcg_init_machine
> tcg: Create tcg_init
> accel/tcg: Merge tcg_exec_init into tcg_init_machine
> accel/tcg: Pass down max_cpus to tcg_init
> tcg: Introduce tcg_max_ctxs
> tcg: Move MAX_CODE_GEN_BUFFER_SIZE to tcg-target.h
> tcg: Replace region.end with region.total_size
> tcg: Rename region.start to region.after_prologue
> tcg: Tidy tcg_n_regions
> tcg: Tidy split_cross_256mb
> tcg: Move in_code_gen_buffer and tests to region.c
> tcg: Allocate code_gen_buffer into struct tcg_region_state
> tcg: Return the map protection from alloc_code_gen_buffer
> tcg: Sink qemu_madvise call to common code
> tcg: Do not set guard pages in the rx buffer
> util/osdep: Add qemu_mprotect_rw
> tcg: Round the tb_size default from qemu_get_host_physmem
> tcg: Merge buffer protection and guard page protection
> tcg: When allocating for !splitwx, begin with PROT_NONE
> tcg: Move tcg_init_ctx and tcg_ctx from accel/tcg/
>
> meson.build | 13 +-
> accel/tcg/internal.h | 2 +
> include/qemu/osdep.h | 1 +
> include/sysemu/tcg.h | 2 -
> include/tcg/tcg.h | 15 +-
> tcg/aarch64/tcg-target.h | 1 +
> tcg/arm/tcg-target.h | 1 +
> tcg/i386/tcg-target.h | 2 +
> tcg/internal.h | 40 ++
> tcg/mips/tcg-target.h | 6 +
> tcg/ppc/tcg-target.h | 2 +
> tcg/riscv/tcg-target.h | 1 +
> tcg/s390/tcg-target.h | 3 +
> tcg/sparc/tcg-target.h | 1 +
> tcg/tci/tcg-target.h | 1 +
> accel/tcg/tcg-all.c | 33 +-
> accel/tcg/translate-all.c | 439 +----------------
> bsd-user/main.c | 1 -
> linux-user/main.c | 1 -
> tcg/region.c | 991 ++++++++++++++++++++++++++++++++++++++
> tcg/tcg.c | 634 ++----------------------
> util/osdep.c | 9 +
> fpu/meson.build | 1 +
> tcg/meson.build | 14 +
> 24 files changed, 1139 insertions(+), 1075 deletions(-)
> create mode 100644 tcg/internal.h
> create mode 100644 tcg/region.c
> create mode 100644 fpu/meson.build
> create mode 100644 tcg/meson.build
>
> --
> 2.25.1
>
* Re: [PATCH v2 01/29] meson: Split out tcg/meson.build
2021-03-14 21:26 ` [PATCH v2 01/29] meson: Split out tcg/meson.build Richard Henderson
@ 2021-03-15 23:09 ` Roman Bolshakov
0 siblings, 0 replies; 38+ messages in thread
From: Roman Bolshakov @ 2021-03-15 23:09 UTC (permalink / raw)
To: Richard Henderson; +Cc: Philippe Mathieu-Daudé, qemu-devel, j
On Sun, Mar 14, 2021 at 03:26:56PM -0600, Richard Henderson wrote:
> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
Thanks,
Roman
> meson.build | 9 ++-------
> tcg/meson.build | 13 +++++++++++++
> 2 files changed, 15 insertions(+), 7 deletions(-)
> create mode 100644 tcg/meson.build
>
> diff --git a/meson.build b/meson.build
> index a7d2dd429d..742f45c8d8 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -1936,14 +1936,8 @@ specific_ss.add(files('cpu.c', 'disas.c', 'gdbstub.c'), capstone)
> specific_ss.add(files('exec-vary.c'))
> specific_ss.add(when: 'CONFIG_TCG', if_true: files(
> 'fpu/softfloat.c',
> - 'tcg/optimize.c',
> - 'tcg/tcg-common.c',
> - 'tcg/tcg-op-gvec.c',
> - 'tcg/tcg-op-vec.c',
> - 'tcg/tcg-op.c',
> - 'tcg/tcg.c',
> ))
> -specific_ss.add(when: 'CONFIG_TCG_INTERPRETER', if_true: files('disas/tci.c', 'tcg/tci.c'))
> +specific_ss.add(when: 'CONFIG_TCG_INTERPRETER', if_true: files('disas/tci.c'))
>
> subdir('backends')
> subdir('disas')
> @@ -1953,6 +1947,7 @@ subdir('net')
> subdir('replay')
> subdir('semihosting')
> subdir('hw')
> +subdir('tcg')
> subdir('accel')
> subdir('plugins')
> subdir('bsd-user')
> diff --git a/tcg/meson.build b/tcg/meson.build
> new file mode 100644
> index 0000000000..84064a341e
> --- /dev/null
> +++ b/tcg/meson.build
> @@ -0,0 +1,13 @@
> +tcg_ss = ss.source_set()
> +
> +tcg_ss.add(files(
> + 'optimize.c',
> + 'tcg.c',
> + 'tcg-common.c',
> + 'tcg-op.c',
> + 'tcg-op-gvec.c',
> + 'tcg-op-vec.c',
> +))
> +tcg_ss.add(when: 'CONFIG_TCG_INTERPRETER', if_true: files('tci.c'))
> +
> +specific_ss.add_all(when: 'CONFIG_TCG', if_true: tcg_ss)
> --
> 2.25.1
>
* Re: [PATCH v2 02/29] meson: Split out fpu/meson.build
2021-03-14 21:26 ` [PATCH v2 02/29] meson: Split out fpu/meson.build Richard Henderson
@ 2021-03-15 23:10 ` Roman Bolshakov
0 siblings, 0 replies; 38+ messages in thread
From: Roman Bolshakov @ 2021-03-15 23:10 UTC (permalink / raw)
To: Richard Henderson; +Cc: Philippe Mathieu-Daudé, qemu-devel, j
On Sun, Mar 14, 2021 at 03:26:57PM -0600, Richard Henderson wrote:
> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>
Thanks,
Roman
> meson.build | 4 +---
> fpu/meson.build | 1 +
> 2 files changed, 2 insertions(+), 3 deletions(-)
> create mode 100644 fpu/meson.build
>
> diff --git a/meson.build b/meson.build
> index 742f45c8d8..bfa24b836e 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -1934,9 +1934,6 @@ subdir('softmmu')
> common_ss.add(capstone)
> specific_ss.add(files('cpu.c', 'disas.c', 'gdbstub.c'), capstone)
> specific_ss.add(files('exec-vary.c'))
> -specific_ss.add(when: 'CONFIG_TCG', if_true: files(
> - 'fpu/softfloat.c',
> -))
> specific_ss.add(when: 'CONFIG_TCG_INTERPRETER', if_true: files('disas/tci.c'))
>
> subdir('backends')
> @@ -1948,6 +1945,7 @@ subdir('replay')
> subdir('semihosting')
> subdir('hw')
> subdir('tcg')
> +subdir('fpu')
> subdir('accel')
> subdir('plugins')
> subdir('bsd-user')
> diff --git a/fpu/meson.build b/fpu/meson.build
> new file mode 100644
> index 0000000000..1a9992ded5
> --- /dev/null
> +++ b/fpu/meson.build
> @@ -0,0 +1 @@
> +specific_ss.add(when: 'CONFIG_TCG', if_true: files('softfloat.c'))
> --
> 2.25.1
>
* Re: [PATCH v2 03/29] tcg: Re-order tcg_region_init vs tcg_prologue_init
2021-03-14 21:26 ` [PATCH v2 03/29] tcg: Re-order tcg_region_init vs tcg_prologue_init Richard Henderson
@ 2021-03-15 23:37 ` Roman Bolshakov
2021-03-16 14:57 ` Richard Henderson
0 siblings, 1 reply; 38+ messages in thread
From: Roman Bolshakov @ 2021-03-15 23:37 UTC (permalink / raw)
To: Richard Henderson; +Cc: qemu-devel, j
On Sun, Mar 14, 2021 at 03:26:58PM -0600, Richard Henderson wrote:
> Instead of delaying tcg_region_init until after tcg_prologue_init
> is complete, do tcg_region_init first and let tcg_prologue_init
> shrink the first region by the size of the generated prologue.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> accel/tcg/tcg-all.c | 11 ---------
> accel/tcg/translate-all.c | 3 +++
> bsd-user/main.c | 1 -
> linux-user/main.c | 1 -
> tcg/tcg.c | 52 ++++++++++++++-------------------------
> 5 files changed, 22 insertions(+), 46 deletions(-)
>
> diff --git a/accel/tcg/tcg-all.c b/accel/tcg/tcg-all.c
> index e378c2db73..f132033999 100644
> --- a/accel/tcg/tcg-all.c
> +++ b/accel/tcg/tcg-all.c
> @@ -111,17 +111,6 @@ static int tcg_init(MachineState *ms)
>
> tcg_exec_init(s->tb_size * 1024 * 1024, s->splitwx_enabled);
> mttcg_enabled = s->mttcg_enabled;
> -
> - /*
> - * Initialize TCG regions only for softmmu.
> - *
> - * This needs to be done later for user mode, because the prologue
> - * generation needs to be delayed so that GUEST_BASE is already set.
> - */
> -#ifndef CONFIG_USER_ONLY
> - tcg_region_init();
Note that tcg_region_init() invokes tcg_n_regions() that depends on
qemu_tcg_mttcg_enabled() that evaluates mttcg_enabled. Likely you need
to move "mttcg_enabled = s->mttcg_enabled;" before tcg_exec_init() to
keep existing behaviour.
> -#endif /* !CONFIG_USER_ONLY */
> -
> return 0;
> }
>
> diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
> index f32df8b240..b9057567f4 100644
> --- a/accel/tcg/translate-all.c
> +++ b/accel/tcg/translate-all.c
> @@ -1339,6 +1339,9 @@ void tcg_exec_init(unsigned long tb_size, int splitwx)
> splitwx, &error_fatal);
> assert(ok);
>
> + /* TODO: allocating regions is hand-in-glove with code_gen_buffer. */
> + tcg_region_init();
> +
> #if defined(CONFIG_SOFTMMU)
> /* There's no guest base to take into account, so go ahead and
> initialize the prologue now. */
> diff --git a/bsd-user/main.c b/bsd-user/main.c
> index 798aba512c..3669d2b89e 100644
> --- a/bsd-user/main.c
> +++ b/bsd-user/main.c
> @@ -994,7 +994,6 @@ int main(int argc, char **argv)
> generating the prologue until now so that the prologue can take
> the real value of GUEST_BASE into account. */
> tcg_prologue_init(tcg_ctx);
> - tcg_region_init();
>
> /* build Task State */
> memset(ts, 0, sizeof(TaskState));
> diff --git a/linux-user/main.c b/linux-user/main.c
> index 4f4746dce8..1bc48ca954 100644
> --- a/linux-user/main.c
> +++ b/linux-user/main.c
> @@ -850,7 +850,6 @@ int main(int argc, char **argv, char **envp)
> generating the prologue until now so that the prologue can take
> the real value of GUEST_BASE into account. */
> tcg_prologue_init(tcg_ctx);
> - tcg_region_init();
>
> target_cpu_copy_regs(env, regs);
>
> diff --git a/tcg/tcg.c b/tcg/tcg.c
> index 2991112829..0a2e5710de 100644
> --- a/tcg/tcg.c
> +++ b/tcg/tcg.c
> @@ -1204,32 +1204,18 @@ TranslationBlock *tcg_tb_alloc(TCGContext *s)
>
> void tcg_prologue_init(TCGContext *s)
> {
> - size_t prologue_size, total_size;
> - void *buf0, *buf1;
> + size_t prologue_size;
>
> /* Put the prologue at the beginning of code_gen_buffer. */
> - buf0 = s->code_gen_buffer;
> - total_size = s->code_gen_buffer_size;
> - s->code_ptr = buf0;
> - s->code_buf = buf0;
> + tcg_region_assign(s, 0);
> + s->code_ptr = s->code_gen_ptr;
> + s->code_buf = s->code_gen_ptr;
Pardon me for asking a naive question, what's the difference between
s->code_buf and s->code_gen_buf and, respectively, s->code_ptr and
s->code_gen_ptr?
Thanks,
Roman
> s->data_gen_ptr = NULL;
>
> - /*
> - * The region trees are not yet configured, but tcg_splitwx_to_rx
> - * needs the bounds for an assert.
> - */
> - region.start = buf0;
> - region.end = buf0 + total_size;
> -
> #ifndef CONFIG_TCG_INTERPRETER
> - tcg_qemu_tb_exec = (tcg_prologue_fn *)tcg_splitwx_to_rx(buf0);
> + tcg_qemu_tb_exec = (tcg_prologue_fn *)tcg_splitwx_to_rx(s->code_ptr);
> #endif
>
> - /* Compute a high-water mark, at which we voluntarily flush the buffer
> - and start over. The size here is arbitrary, significantly larger
> - than we expect the code generation for any one opcode to require. */
> - s->code_gen_highwater = s->code_gen_buffer + (total_size - TCG_HIGHWATER);
> -
> #ifdef TCG_TARGET_NEED_POOL_LABELS
> s->pool_labels = NULL;
> #endif
> @@ -1246,32 +1232,32 @@ void tcg_prologue_init(TCGContext *s)
> }
> #endif
>
> - buf1 = s->code_ptr;
> + prologue_size = tcg_current_code_size(s);
> +
> #ifndef CONFIG_TCG_INTERPRETER
> - flush_idcache_range((uintptr_t)tcg_splitwx_to_rx(buf0), (uintptr_t)buf0,
> - tcg_ptr_byte_diff(buf1, buf0));
> + flush_idcache_range((uintptr_t)tcg_splitwx_to_rx(s->code_buf),
> + (uintptr_t)s->code_buf, prologue_size);
> #endif
>
> - /* Deduct the prologue from the buffer. */
> - prologue_size = tcg_current_code_size(s);
> - s->code_gen_ptr = buf1;
> - s->code_gen_buffer = buf1;
> - s->code_buf = buf1;
> - total_size -= prologue_size;
> - s->code_gen_buffer_size = total_size;
> + /* Deduct the prologue from the first region. */
> + region.start = s->code_ptr;
>
> - tcg_register_jit(tcg_splitwx_to_rx(s->code_gen_buffer), total_size);
> + /* Recompute boundaries of the first region. */
> + tcg_region_assign(s, 0);
> +
> + tcg_register_jit(tcg_splitwx_to_rx(region.start),
> + region.end - region.start);
>
> #ifdef DEBUG_DISAS
> if (qemu_loglevel_mask(CPU_LOG_TB_OUT_ASM)) {
> FILE *logfile = qemu_log_lock();
> qemu_log("PROLOGUE: [size=%zu]\n", prologue_size);
> if (s->data_gen_ptr) {
> - size_t code_size = s->data_gen_ptr - buf0;
> + size_t code_size = s->data_gen_ptr - s->code_gen_ptr;
> size_t data_size = prologue_size - code_size;
> size_t i;
>
> - log_disas(buf0, code_size);
> + log_disas(s->code_gen_ptr, code_size);
>
> for (i = 0; i < data_size; i += sizeof(tcg_target_ulong)) {
> if (sizeof(tcg_target_ulong) == 8) {
> @@ -1285,7 +1271,7 @@ void tcg_prologue_init(TCGContext *s)
> }
> }
> } else {
> - log_disas(buf0, prologue_size);
> + log_disas(s->code_gen_ptr, prologue_size);
> }
> qemu_log("\n");
> qemu_log_flush();
> --
> 2.25.1
>
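
The commit message quoted above describes initializing the regions first and then letting the prologue shrink the first region. The following is a simplified sketch of that ordering, not the code from the patch: it models a single region, and every toy_* name and struct is hypothetical. The "prologue" is simulated by advancing the output pointer by a given number of bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the region bookkeeping (one region only). */
struct toy_region {
    uint8_t *start;   /* first byte usable for translated code */
    uint8_t *end;     /* one past the last byte of the buffer */
};

/* Hypothetical stand-in for the fields of the context used here. */
struct toy_ctx {
    uint8_t *code_gen_ptr;   /* start of the assigned region */
    uint8_t *code_gen_end;   /* end of the assigned region */
    uint8_t *code_ptr;       /* current output position */
};

/* Step 1: set up the region over the whole buffer. */
void toy_region_init(struct toy_region *r, uint8_t *buf, size_t size)
{
    r->start = buf;
    r->end = buf + size;
}

/* Copy the region bounds into the context. */
void toy_region_assign(const struct toy_region *r, struct toy_ctx *s)
{
    s->code_gen_ptr = r->start;
    s->code_gen_end = r->end;
}

/*
 * Step 2: place the prologue at the beginning of the region, then
 * deduct it from the region and recompute the context's bounds.
 * Returns the size of the (simulated) prologue.
 */
size_t toy_prologue_init(struct toy_region *r, struct toy_ctx *s,
                         size_t prologue_bytes)
{
    size_t prologue_size;

    toy_region_assign(r, s);
    s->code_ptr = s->code_gen_ptr;

    /* Simulate code emission by advancing the output pointer. */
    assert(prologue_bytes <= (size_t)(s->code_gen_end - s->code_gen_ptr));
    s->code_ptr += prologue_bytes;
    prologue_size = (size_t)(s->code_ptr - s->code_gen_ptr);

    /* Deduct the prologue from the region and reassign it. */
    r->start = s->code_ptr;
    toy_region_assign(r, s);
    return prologue_size;
}
```

Calling toy_region_init() and then toy_prologue_init() leaves the region starting just past the prologue, which is the effect the commit message describes for the first region.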
* Re: [PATCH v2 03/29] tcg: Re-order tcg_region_init vs tcg_prologue_init
2021-03-15 23:37 ` Roman Bolshakov
@ 2021-03-16 14:57 ` Richard Henderson
0 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-16 14:57 UTC (permalink / raw)
To: Roman Bolshakov; +Cc: qemu-devel, j
On 3/15/21 5:37 PM, Roman Bolshakov wrote:
>> tcg_exec_init(s->tb_size * 1024 * 1024, s->splitwx_enabled);
>> mttcg_enabled = s->mttcg_enabled;
>> -
>> - /*
>> - * Initialize TCG regions only for softmmu.
>> - *
>> - * This needs to be done later for user mode, because the prologue
>> - * generation needs to be delayed so that GUEST_BASE is already set.
>> - */
>> -#ifndef CONFIG_USER_ONLY
>> - tcg_region_init();
>
> Note that tcg_region_init() invokes tcg_n_regions() that depends on
> qemu_tcg_mttcg_enabled() that evaluates mttcg_enabled. Likely you need
> to move "mttcg_enabled = s->mttcg_enabled;" before tcg_exec_init() to
> keep existing behaviour.
Yes indeed. This gets fixed in patch 12, which is why I didn't notice
breakage. Will adjust.
>> - total_size = s->code_gen_buffer_size;
>> - s->code_ptr = buf0;
>> - s->code_buf = buf0;
>> + tcg_region_assign(s, 0);
>> + s->code_ptr = s->code_gen_ptr;
>> + s->code_buf = s->code_gen_ptr;
>
> Pardon me for asking a naive question, what's the difference between
> s->code_buf and s->code_gen_buf and, respectively, s->code_ptr and
> s->code_gen_ptr?
I don't remember. I actually had it in my mind to rename all of these, remove
one or two that feel redundant, and document them all. But the patch set was
large enough already.
r~
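
The ordering issue discussed above can be illustrated with a small sketch. It assumes, purely for illustration, that the region count is the CPU count when the multi-threaded flag is set and 1 otherwise; the real tcg_n_regions() is more involved. All toy_* names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the global flag read during region setup. */
static bool toy_mttcg_enabled;

struct toy_machine {
    bool mttcg_enabled;
    unsigned max_cpus;
};

/* Assumed policy for this sketch: one region per CPU when the flag is set. */
size_t toy_n_regions(unsigned max_cpus)
{
    return toy_mttcg_enabled ? max_cpus : 1;
}

/* Flag assigned after region setup: the setup reads the stale default. */
size_t toy_init_flag_late(const struct toy_machine *ms)
{
    size_t n;

    toy_mttcg_enabled = false;            /* fresh-process default */
    n = toy_n_regions(ms->max_cpus);      /* reads the stale flag */
    toy_mttcg_enabled = ms->mttcg_enabled;
    return n;
}

/* Flag assigned first: region setup sees the configured value. */
size_t toy_init_flag_first(const struct toy_machine *ms)
{
    toy_mttcg_enabled = ms->mttcg_enabled;
    return toy_n_regions(ms->max_cpus);
}
```

With the flag requested and four CPUs, the "late" variant computes one region and the "first" variant computes four, which is the behaviour change the review comment points at.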
Thread overview: 38+ messages
2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
2021-03-14 21:26 ` [PATCH v2 01/29] meson: Split out tcg/meson.build Richard Henderson
2021-03-15 23:09 ` Roman Bolshakov
2021-03-14 21:26 ` [PATCH v2 02/29] meson: Split out fpu/meson.build Richard Henderson
2021-03-15 23:10 ` Roman Bolshakov
2021-03-14 21:26 ` [PATCH v2 03/29] tcg: Re-order tcg_region_init vs tcg_prologue_init Richard Henderson
2021-03-15 23:37 ` Roman Bolshakov
2021-03-16 14:57 ` Richard Henderson
2021-03-14 21:26 ` [PATCH v2 04/29] tcg: Remove error return from tcg_region_initial_alloc__locked Richard Henderson
2021-03-14 21:27 ` [PATCH v2 05/29] tcg: Split out tcg_region_initial_alloc Richard Henderson
2021-03-14 21:27 ` [PATCH v2 06/29] tcg: Split out tcg_region_prologue_set Richard Henderson
2021-03-14 21:27 ` [PATCH v2 07/29] tcg: Split out region.c Richard Henderson
2021-03-14 21:27 ` [PATCH v2 08/29] accel/tcg: Inline cpu_gen_init Richard Henderson
2021-03-14 21:27 ` [PATCH v2 09/29] accel/tcg: Move alloc_code_gen_buffer to tcg/region.c Richard Henderson
2021-03-14 21:27 ` [PATCH v2 10/29] accel/tcg: Rename tcg_init to tcg_init_machine Richard Henderson
2021-03-14 21:27 ` [PATCH v2 11/29] tcg: Create tcg_init Richard Henderson
2021-03-14 21:27 ` [PATCH v2 12/29] accel/tcg: Merge tcg_exec_init into tcg_init_machine Richard Henderson
2021-03-14 21:27 ` [PATCH v2 13/29] accel/tcg: Pass down max_cpus to tcg_init Richard Henderson
2021-03-14 21:27 ` [PATCH v2 14/29] tcg: Introduce tcg_max_ctxs Richard Henderson
2021-03-14 21:27 ` [PATCH v2 15/29] tcg: Move MAX_CODE_GEN_BUFFER_SIZE to tcg-target.h Richard Henderson
2021-03-14 21:27 ` [PATCH v2 16/29] tcg: Replace region.end with region.total_size Richard Henderson
2021-03-14 21:27 ` [PATCH v2 17/29] tcg: Rename region.start to region.after_prologue Richard Henderson
2021-03-14 21:27 ` [PATCH v2 18/29] tcg: Tidy tcg_n_regions Richard Henderson
2021-03-14 21:27 ` [PATCH v2 19/29] tcg: Tidy split_cross_256mb Richard Henderson
2021-03-14 21:27 ` [PATCH v2 20/29] tcg: Move in_code_gen_buffer and tests to region.c Richard Henderson
2021-03-14 21:27 ` [PATCH v2 21/29] tcg: Allocate code_gen_buffer into struct tcg_region_state Richard Henderson
2021-03-14 21:27 ` [PATCH v2 22/29] tcg: Return the map protection from alloc_code_gen_buffer Richard Henderson
2021-03-14 22:04 ` Philippe Mathieu-Daudé
2021-03-14 21:27 ` [PATCH v2 23/29] tcg: Sink qemu_madvise call to common code Richard Henderson
2021-03-14 21:27 ` [PATCH v2 24/29] tcg: Do not set guard pages in the rx buffer Richard Henderson
2021-03-14 21:27 ` [PATCH v2 25/29] util/osdep: Add qemu_mprotect_rw Richard Henderson
2021-03-14 21:27 ` [PATCH v2 26/29] tcg: Round the tb_size default from qemu_get_host_physmem Richard Henderson
2021-03-14 21:27 ` [PATCH v2 27/29] tcg: Merge buffer protection and guard page protection Richard Henderson
2021-03-14 21:27 ` [PATCH v2 28/29] tcg: When allocating for !splitwx, begin with PROT_NONE Richard Henderson
2021-03-14 21:27 ` [PATCH v2 29/29] tcg: Move tcg_init_ctx and tcg_ctx from accel/tcg/ Richard Henderson
2021-03-14 22:00 ` Philippe Mathieu-Daudé
2021-03-14 22:12 ` [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug no-reply
2021-03-15 23:08 ` Roman Bolshakov