qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug
@ 2021-03-14 21:26 Richard Henderson
  2021-03-14 21:26 ` [PATCH v2 01/29] meson: Split out tcg/meson.build Richard Henderson
                   ` (30 more replies)
  0 siblings, 31 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:26 UTC (permalink / raw)
  To: qemu-devel; +Cc: r.bolshakov, j

Changes for v2:
  * Move tcg_init_ctx someplace more private (patch 29)
  * Round result of tb_size based on qemu_get_host_physmem (patch 26)

Blurb for v1:
  It took a few more patches than imagined to unify the two
  places in which we manipulate the tcg code_gen buffer, but
  the result is surely cleaner.

  There's a lot more that could be done to clean up this part
  of tcg too.  I tried to not get too side-tracked, but didn't
  wholly succeed.


r~


Richard Henderson (29):
  meson: Split out tcg/meson.build
  meson: Split out fpu/meson.build
  tcg: Re-order tcg_region_init vs tcg_prologue_init
  tcg: Remove error return from tcg_region_initial_alloc__locked
  tcg: Split out tcg_region_initial_alloc
  tcg: Split out tcg_region_prologue_set
  tcg: Split out region.c
  accel/tcg: Inline cpu_gen_init
  accel/tcg: Move alloc_code_gen_buffer to tcg/region.c
  accel/tcg: Rename tcg_init to tcg_init_machine
  tcg: Create tcg_init
  accel/tcg: Merge tcg_exec_init into tcg_init_machine
  accel/tcg: Pass down max_cpus to tcg_init
  tcg: Introduce tcg_max_ctxs
  tcg: Move MAX_CODE_GEN_BUFFER_SIZE to tcg-target.h
  tcg: Replace region.end with region.total_size
  tcg: Rename region.start to region.after_prologue
  tcg: Tidy tcg_n_regions
  tcg: Tidy split_cross_256mb
  tcg: Move in_code_gen_buffer and tests to region.c
  tcg: Allocate code_gen_buffer into struct tcg_region_state
  tcg: Return the map protection from alloc_code_gen_buffer
  tcg: Sink qemu_madvise call to common code
  tcg: Do not set guard pages in the rx buffer
  util/osdep: Add qemu_mprotect_rw
  tcg: Round the tb_size default from qemu_get_host_physmem
  tcg: Merge buffer protection and guard page protection
  tcg: When allocating for !splitwx, begin with PROT_NONE
  tcg: Move tcg_init_ctx and tcg_ctx from accel/tcg/

 meson.build               |  13 +-
 accel/tcg/internal.h      |   2 +
 include/qemu/osdep.h      |   1 +
 include/sysemu/tcg.h      |   2 -
 include/tcg/tcg.h         |  15 +-
 tcg/aarch64/tcg-target.h  |   1 +
 tcg/arm/tcg-target.h      |   1 +
 tcg/i386/tcg-target.h     |   2 +
 tcg/internal.h            |  40 ++
 tcg/mips/tcg-target.h     |   6 +
 tcg/ppc/tcg-target.h      |   2 +
 tcg/riscv/tcg-target.h    |   1 +
 tcg/s390/tcg-target.h     |   3 +
 tcg/sparc/tcg-target.h    |   1 +
 tcg/tci/tcg-target.h      |   1 +
 accel/tcg/tcg-all.c       |  33 +-
 accel/tcg/translate-all.c | 439 +----------------
 bsd-user/main.c           |   1 -
 linux-user/main.c         |   1 -
 tcg/region.c              | 991 ++++++++++++++++++++++++++++++++++++++
 tcg/tcg.c                 | 634 ++----------------------
 util/osdep.c              |   9 +
 fpu/meson.build           |   1 +
 tcg/meson.build           |  14 +
 24 files changed, 1139 insertions(+), 1075 deletions(-)
 create mode 100644 tcg/internal.h
 create mode 100644 tcg/region.c
 create mode 100644 fpu/meson.build
 create mode 100644 tcg/meson.build

-- 
2.25.1



^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2 01/29] meson: Split out tcg/meson.build
  2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
@ 2021-03-14 21:26 ` Richard Henderson
  2021-03-15 23:09   ` Roman Bolshakov
  2021-03-14 21:26 ` [PATCH v2 02/29] meson: Split out fpu/meson.build Richard Henderson
                   ` (29 subsequent siblings)
  30 siblings, 1 reply; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:26 UTC (permalink / raw)
  To: qemu-devel; +Cc: r.bolshakov, j, Philippe Mathieu-Daudé

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 meson.build     |  9 ++-------
 tcg/meson.build | 13 +++++++++++++
 2 files changed, 15 insertions(+), 7 deletions(-)
 create mode 100644 tcg/meson.build

diff --git a/meson.build b/meson.build
index a7d2dd429d..742f45c8d8 100644
--- a/meson.build
+++ b/meson.build
@@ -1936,14 +1936,8 @@ specific_ss.add(files('cpu.c', 'disas.c', 'gdbstub.c'), capstone)
 specific_ss.add(files('exec-vary.c'))
 specific_ss.add(when: 'CONFIG_TCG', if_true: files(
   'fpu/softfloat.c',
-  'tcg/optimize.c',
-  'tcg/tcg-common.c',
-  'tcg/tcg-op-gvec.c',
-  'tcg/tcg-op-vec.c',
-  'tcg/tcg-op.c',
-  'tcg/tcg.c',
 ))
-specific_ss.add(when: 'CONFIG_TCG_INTERPRETER', if_true: files('disas/tci.c', 'tcg/tci.c'))
+specific_ss.add(when: 'CONFIG_TCG_INTERPRETER', if_true: files('disas/tci.c'))
 
 subdir('backends')
 subdir('disas')
@@ -1953,6 +1947,7 @@ subdir('net')
 subdir('replay')
 subdir('semihosting')
 subdir('hw')
+subdir('tcg')
 subdir('accel')
 subdir('plugins')
 subdir('bsd-user')
diff --git a/tcg/meson.build b/tcg/meson.build
new file mode 100644
index 0000000000..84064a341e
--- /dev/null
+++ b/tcg/meson.build
@@ -0,0 +1,13 @@
+tcg_ss = ss.source_set()
+
+tcg_ss.add(files(
+  'optimize.c',
+  'tcg.c',
+  'tcg-common.c',
+  'tcg-op.c',
+  'tcg-op-gvec.c',
+  'tcg-op-vec.c',
+))
+tcg_ss.add(when: 'CONFIG_TCG_INTERPRETER', if_true: files('tci.c'))
+
+specific_ss.add_all(when: 'CONFIG_TCG', if_true: tcg_ss)
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 02/29] meson: Split out fpu/meson.build
  2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
  2021-03-14 21:26 ` [PATCH v2 01/29] meson: Split out tcg/meson.build Richard Henderson
@ 2021-03-14 21:26 ` Richard Henderson
  2021-03-15 23:10   ` Roman Bolshakov
  2021-03-14 21:26 ` [PATCH v2 03/29] tcg: Re-order tcg_region_init vs tcg_prologue_init Richard Henderson
                   ` (28 subsequent siblings)
  30 siblings, 1 reply; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:26 UTC (permalink / raw)
  To: qemu-devel; +Cc: r.bolshakov, j, Philippe Mathieu-Daudé

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 meson.build     | 4 +---
 fpu/meson.build | 1 +
 2 files changed, 2 insertions(+), 3 deletions(-)
 create mode 100644 fpu/meson.build

diff --git a/meson.build b/meson.build
index 742f45c8d8..bfa24b836e 100644
--- a/meson.build
+++ b/meson.build
@@ -1934,9 +1934,6 @@ subdir('softmmu')
 common_ss.add(capstone)
 specific_ss.add(files('cpu.c', 'disas.c', 'gdbstub.c'), capstone)
 specific_ss.add(files('exec-vary.c'))
-specific_ss.add(when: 'CONFIG_TCG', if_true: files(
-  'fpu/softfloat.c',
-))
 specific_ss.add(when: 'CONFIG_TCG_INTERPRETER', if_true: files('disas/tci.c'))
 
 subdir('backends')
@@ -1948,6 +1945,7 @@ subdir('replay')
 subdir('semihosting')
 subdir('hw')
 subdir('tcg')
+subdir('fpu')
 subdir('accel')
 subdir('plugins')
 subdir('bsd-user')
diff --git a/fpu/meson.build b/fpu/meson.build
new file mode 100644
index 0000000000..1a9992ded5
--- /dev/null
+++ b/fpu/meson.build
@@ -0,0 +1 @@
+specific_ss.add(when: 'CONFIG_TCG', if_true: files('softfloat.c'))
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 03/29] tcg: Re-order tcg_region_init vs tcg_prologue_init
  2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
  2021-03-14 21:26 ` [PATCH v2 01/29] meson: Split out tcg/meson.build Richard Henderson
  2021-03-14 21:26 ` [PATCH v2 02/29] meson: Split out fpu/meson.build Richard Henderson
@ 2021-03-14 21:26 ` Richard Henderson
  2021-03-15 23:37   ` Roman Bolshakov
  2021-03-14 21:26 ` [PATCH v2 04/29] tcg: Remove error return from tcg_region_initial_alloc__locked Richard Henderson
                   ` (27 subsequent siblings)
  30 siblings, 1 reply; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:26 UTC (permalink / raw)
  To: qemu-devel; +Cc: r.bolshakov, j

Instead of delaying tcg_region_init until after tcg_prologue_init
is complete, do tcg_region_init first and let tcg_prologue_init
shrink the first region by the size of the generated prologue.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/tcg-all.c       | 11 ---------
 accel/tcg/translate-all.c |  3 +++
 bsd-user/main.c           |  1 -
 linux-user/main.c         |  1 -
 tcg/tcg.c                 | 52 ++++++++++++++-------------------------
 5 files changed, 22 insertions(+), 46 deletions(-)

diff --git a/accel/tcg/tcg-all.c b/accel/tcg/tcg-all.c
index e378c2db73..f132033999 100644
--- a/accel/tcg/tcg-all.c
+++ b/accel/tcg/tcg-all.c
@@ -111,17 +111,6 @@ static int tcg_init(MachineState *ms)
 
     tcg_exec_init(s->tb_size * 1024 * 1024, s->splitwx_enabled);
     mttcg_enabled = s->mttcg_enabled;
-
-    /*
-     * Initialize TCG regions only for softmmu.
-     *
-     * This needs to be done later for user mode, because the prologue
-     * generation needs to be delayed so that GUEST_BASE is already set.
-     */
-#ifndef CONFIG_USER_ONLY
-    tcg_region_init();
-#endif /* !CONFIG_USER_ONLY */
-
     return 0;
 }
 
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index f32df8b240..b9057567f4 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -1339,6 +1339,9 @@ void tcg_exec_init(unsigned long tb_size, int splitwx)
                                splitwx, &error_fatal);
     assert(ok);
 
+    /* TODO: allocating regions is hand-in-glove with code_gen_buffer. */
+    tcg_region_init();
+
 #if defined(CONFIG_SOFTMMU)
     /* There's no guest base to take into account, so go ahead and
        initialize the prologue now.  */
diff --git a/bsd-user/main.c b/bsd-user/main.c
index 798aba512c..3669d2b89e 100644
--- a/bsd-user/main.c
+++ b/bsd-user/main.c
@@ -994,7 +994,6 @@ int main(int argc, char **argv)
        generating the prologue until now so that the prologue can take
        the real value of GUEST_BASE into account.  */
     tcg_prologue_init(tcg_ctx);
-    tcg_region_init();
 
     /* build Task State */
     memset(ts, 0, sizeof(TaskState));
diff --git a/linux-user/main.c b/linux-user/main.c
index 4f4746dce8..1bc48ca954 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -850,7 +850,6 @@ int main(int argc, char **argv, char **envp)
        generating the prologue until now so that the prologue can take
        the real value of GUEST_BASE into account.  */
     tcg_prologue_init(tcg_ctx);
-    tcg_region_init();
 
     target_cpu_copy_regs(env, regs);
 
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 2991112829..0a2e5710de 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -1204,32 +1204,18 @@ TranslationBlock *tcg_tb_alloc(TCGContext *s)
 
 void tcg_prologue_init(TCGContext *s)
 {
-    size_t prologue_size, total_size;
-    void *buf0, *buf1;
+    size_t prologue_size;
 
     /* Put the prologue at the beginning of code_gen_buffer.  */
-    buf0 = s->code_gen_buffer;
-    total_size = s->code_gen_buffer_size;
-    s->code_ptr = buf0;
-    s->code_buf = buf0;
+    tcg_region_assign(s, 0);
+    s->code_ptr = s->code_gen_ptr;
+    s->code_buf = s->code_gen_ptr;
     s->data_gen_ptr = NULL;
 
-    /*
-     * The region trees are not yet configured, but tcg_splitwx_to_rx
-     * needs the bounds for an assert.
-     */
-    region.start = buf0;
-    region.end = buf0 + total_size;
-
 #ifndef CONFIG_TCG_INTERPRETER
-    tcg_qemu_tb_exec = (tcg_prologue_fn *)tcg_splitwx_to_rx(buf0);
+    tcg_qemu_tb_exec = (tcg_prologue_fn *)tcg_splitwx_to_rx(s->code_ptr);
 #endif
 
-    /* Compute a high-water mark, at which we voluntarily flush the buffer
-       and start over.  The size here is arbitrary, significantly larger
-       than we expect the code generation for any one opcode to require.  */
-    s->code_gen_highwater = s->code_gen_buffer + (total_size - TCG_HIGHWATER);
-
 #ifdef TCG_TARGET_NEED_POOL_LABELS
     s->pool_labels = NULL;
 #endif
@@ -1246,32 +1232,32 @@ void tcg_prologue_init(TCGContext *s)
     }
 #endif
 
-    buf1 = s->code_ptr;
+    prologue_size = tcg_current_code_size(s);
+
 #ifndef CONFIG_TCG_INTERPRETER
-    flush_idcache_range((uintptr_t)tcg_splitwx_to_rx(buf0), (uintptr_t)buf0,
-                        tcg_ptr_byte_diff(buf1, buf0));
+    flush_idcache_range((uintptr_t)tcg_splitwx_to_rx(s->code_buf),
+                        (uintptr_t)s->code_buf, prologue_size);
 #endif
 
-    /* Deduct the prologue from the buffer.  */
-    prologue_size = tcg_current_code_size(s);
-    s->code_gen_ptr = buf1;
-    s->code_gen_buffer = buf1;
-    s->code_buf = buf1;
-    total_size -= prologue_size;
-    s->code_gen_buffer_size = total_size;
+    /* Deduct the prologue from the first region.  */
+    region.start = s->code_ptr;
 
-    tcg_register_jit(tcg_splitwx_to_rx(s->code_gen_buffer), total_size);
+    /* Recompute boundaries of the first region. */
+    tcg_region_assign(s, 0);
+
+    tcg_register_jit(tcg_splitwx_to_rx(region.start),
+                     region.end - region.start);
 
 #ifdef DEBUG_DISAS
     if (qemu_loglevel_mask(CPU_LOG_TB_OUT_ASM)) {
         FILE *logfile = qemu_log_lock();
         qemu_log("PROLOGUE: [size=%zu]\n", prologue_size);
         if (s->data_gen_ptr) {
-            size_t code_size = s->data_gen_ptr - buf0;
+            size_t code_size = s->data_gen_ptr - s->code_gen_ptr;
             size_t data_size = prologue_size - code_size;
             size_t i;
 
-            log_disas(buf0, code_size);
+            log_disas(s->code_gen_ptr, code_size);
 
             for (i = 0; i < data_size; i += sizeof(tcg_target_ulong)) {
                 if (sizeof(tcg_target_ulong) == 8) {
@@ -1285,7 +1271,7 @@ void tcg_prologue_init(TCGContext *s)
                 }
             }
         } else {
-            log_disas(buf0, prologue_size);
+            log_disas(s->code_gen_ptr, prologue_size);
         }
         qemu_log("\n");
         qemu_log_flush();
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 04/29] tcg: Remove error return from tcg_region_initial_alloc__locked
  2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
                   ` (2 preceding siblings ...)
  2021-03-14 21:26 ` [PATCH v2 03/29] tcg: Re-order tcg_region_init vs tcg_prologue_init Richard Henderson
@ 2021-03-14 21:26 ` Richard Henderson
  2021-03-14 21:27 ` [PATCH v2 05/29] tcg: Split out tcg_region_initial_alloc Richard Henderson
                   ` (26 subsequent siblings)
  30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:26 UTC (permalink / raw)
  To: qemu-devel; +Cc: r.bolshakov, j, Philippe Mathieu-Daudé

All callers immediately assert on error, so move the assert
into the function itself.

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/tcg.c | 19 ++++++-------------
 1 file changed, 6 insertions(+), 13 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index 0a2e5710de..2b631fccdf 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -720,9 +720,10 @@ static bool tcg_region_alloc(TCGContext *s)
  * Perform a context's first region allocation.
  * This function does _not_ increment region.agg_size_full.
  */
-static inline bool tcg_region_initial_alloc__locked(TCGContext *s)
+static void tcg_region_initial_alloc__locked(TCGContext *s)
 {
-    return tcg_region_alloc__locked(s);
+    bool err = tcg_region_alloc__locked(s);
+    g_assert(!err);
 }
 
 /* Call from a safe-work context */
@@ -737,9 +738,7 @@ void tcg_region_reset_all(void)
 
     for (i = 0; i < n_ctxs; i++) {
         TCGContext *s = qatomic_read(&tcg_ctxs[i]);
-        bool err = tcg_region_initial_alloc__locked(s);
-
-        g_assert(!err);
+        tcg_region_initial_alloc__locked(s);
     }
     qemu_mutex_unlock(&region.lock);
 
@@ -874,11 +873,7 @@ void tcg_region_init(void)
 
     /* In user-mode we support only one ctx, so do the initial allocation now */
 #ifdef CONFIG_USER_ONLY
-    {
-        bool err = tcg_region_initial_alloc__locked(tcg_ctx);
-
-        g_assert(!err);
-    }
+    tcg_region_initial_alloc__locked(tcg_ctx);
 #endif
 }
 
@@ -940,7 +935,6 @@ void tcg_register_thread(void)
     MachineState *ms = MACHINE(qdev_get_machine());
     TCGContext *s = g_malloc(sizeof(*s));
     unsigned int i, n;
-    bool err;
 
     *s = tcg_init_ctx;
 
@@ -964,8 +958,7 @@ void tcg_register_thread(void)
 
     tcg_ctx = s;
     qemu_mutex_lock(&region.lock);
-    err = tcg_region_initial_alloc__locked(tcg_ctx);
-    g_assert(!err);
+    tcg_region_initial_alloc__locked(s);
     qemu_mutex_unlock(&region.lock);
 }
 #endif /* !CONFIG_USER_ONLY */
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 05/29] tcg: Split out tcg_region_initial_alloc
  2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
                   ` (3 preceding siblings ...)
  2021-03-14 21:26 ` [PATCH v2 04/29] tcg: Remove error return from tcg_region_initial_alloc__locked Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
  2021-03-14 21:27 ` [PATCH v2 06/29] tcg: Split out tcg_region_prologue_set Richard Henderson
                   ` (25 subsequent siblings)
  30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: r.bolshakov, j

This has only one user, and currently needs an ifdef,
but will make more sense after some code motion.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/tcg.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index 2b631fccdf..3316a22bde 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -726,6 +726,15 @@ static void tcg_region_initial_alloc__locked(TCGContext *s)
     g_assert(!err);
 }
 
+#ifndef CONFIG_USER_ONLY
+static void tcg_region_initial_alloc(TCGContext *s)
+{
+    qemu_mutex_lock(&region.lock);
+    tcg_region_initial_alloc__locked(s);
+    qemu_mutex_unlock(&region.lock);
+}
+#endif
+
 /* Call from a safe-work context */
 void tcg_region_reset_all(void)
 {
@@ -957,9 +966,7 @@ void tcg_register_thread(void)
     }
 
     tcg_ctx = s;
-    qemu_mutex_lock(&region.lock);
-    tcg_region_initial_alloc__locked(s);
-    qemu_mutex_unlock(&region.lock);
+    tcg_region_initial_alloc(s);
 }
 #endif /* !CONFIG_USER_ONLY */
 
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 06/29] tcg: Split out tcg_region_prologue_set
  2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
                   ` (4 preceding siblings ...)
  2021-03-14 21:27 ` [PATCH v2 05/29] tcg: Split out tcg_region_initial_alloc Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
  2021-03-14 21:27 ` [PATCH v2 07/29] tcg: Split out region.c Richard Henderson
                   ` (24 subsequent siblings)
  30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: r.bolshakov, j

This has only one user, but will make more sense after some
code motion.

Always leave the tcg_init_ctx initialized to the first region,
in preparation for tcg_prologue_init().  This also requires
that we don't re-allocate the region for the first cpu, lest
we hit the assertion for total number of regions allocated .

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/tcg.c | 37 ++++++++++++++++++++++---------------
 1 file changed, 22 insertions(+), 15 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index 3316a22bde..5b3525d52a 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -880,10 +880,26 @@ void tcg_region_init(void)
 
     tcg_region_trees_init();
 
-    /* In user-mode we support only one ctx, so do the initial allocation now */
-#ifdef CONFIG_USER_ONLY
-    tcg_region_initial_alloc__locked(tcg_ctx);
-#endif
+    /*
+     * Leave the initial context initialized to the first region.
+     * This will be the context into which we generate the prologue.
+     * It is also the only context for CONFIG_USER_ONLY.
+     */
+    tcg_region_initial_alloc__locked(&tcg_init_ctx);
+}
+
+static void tcg_region_prologue_set(TCGContext *s)
+{
+    /* Deduct the prologue from the first region.  */
+    g_assert(region.start == s->code_gen_buffer);
+    region.start = s->code_ptr;
+
+    /* Recompute boundaries of the first region. */
+    tcg_region_assign(s, 0);
+
+    /* Register the balance of the buffer with gdb. */
+    tcg_register_jit(tcg_splitwx_to_rx(region.start),
+                     region.end - region.start);
 }
 
 #ifdef CONFIG_DEBUG_TCG
@@ -963,10 +979,10 @@ void tcg_register_thread(void)
 
     if (n > 0) {
         alloc_tcg_plugin_context(s);
+        tcg_region_initial_alloc(s);
     }
 
     tcg_ctx = s;
-    tcg_region_initial_alloc(s);
 }
 #endif /* !CONFIG_USER_ONLY */
 
@@ -1206,8 +1222,6 @@ void tcg_prologue_init(TCGContext *s)
 {
     size_t prologue_size;
 
-    /* Put the prologue at the beginning of code_gen_buffer.  */
-    tcg_region_assign(s, 0);
     s->code_ptr = s->code_gen_ptr;
     s->code_buf = s->code_gen_ptr;
     s->data_gen_ptr = NULL;
@@ -1239,14 +1253,7 @@ void tcg_prologue_init(TCGContext *s)
                         (uintptr_t)s->code_buf, prologue_size);
 #endif
 
-    /* Deduct the prologue from the first region.  */
-    region.start = s->code_ptr;
-
-    /* Recompute boundaries of the first region. */
-    tcg_region_assign(s, 0);
-
-    tcg_register_jit(tcg_splitwx_to_rx(region.start),
-                     region.end - region.start);
+    tcg_region_prologue_set(s);
 
 #ifdef DEBUG_DISAS
     if (qemu_loglevel_mask(CPU_LOG_TB_OUT_ASM)) {
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 07/29] tcg: Split out region.c
  2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
                   ` (5 preceding siblings ...)
  2021-03-14 21:27 ` [PATCH v2 06/29] tcg: Split out tcg_region_prologue_set Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
  2021-03-14 21:27 ` [PATCH v2 08/29] accel/tcg: Inline cpu_gen_init Richard Henderson
                   ` (23 subsequent siblings)
  30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: r.bolshakov, j

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/internal.h  |  37 ++++
 tcg/region.c    | 570 ++++++++++++++++++++++++++++++++++++++++++++++++
 tcg/tcg.c       | 545 +--------------------------------------------
 tcg/meson.build |   1 +
 4 files changed, 611 insertions(+), 542 deletions(-)
 create mode 100644 tcg/internal.h
 create mode 100644 tcg/region.c

diff --git a/tcg/internal.h b/tcg/internal.h
new file mode 100644
index 0000000000..b1dda343c2
--- /dev/null
+++ b/tcg/internal.h
@@ -0,0 +1,37 @@
+/*
+ * Internal declarations for Tiny Code Generator for QEMU
+ *
+ * Copyright (c) 2008 Fabrice Bellard
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#ifndef TCG_INTERNAL_H
+#define TCG_INTERNAL_H 1
+
+#define TCG_HIGHWATER 1024
+
+extern TCGContext **tcg_ctxs;
+extern unsigned int n_tcg_ctxs;
+
+bool tcg_region_alloc(TCGContext *s);
+void tcg_region_initial_alloc(TCGContext *s);
+void tcg_region_prologue_set(TCGContext *s);
+
+#endif /* TCG_INTERNAL_H */
diff --git a/tcg/region.c b/tcg/region.c
new file mode 100644
index 0000000000..af45a0174e
--- /dev/null
+++ b/tcg/region.c
@@ -0,0 +1,570 @@
+/*
+ * Memory region management for Tiny Code Generator for QEMU
+ *
+ * Copyright (c) 2008 Fabrice Bellard
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "qemu/osdep.h"
+#include "exec/exec-all.h"
+#include "tcg/tcg.h"
+#if !defined(CONFIG_USER_ONLY)
+#include "hw/boards.h"
+#endif
+#include "internal.h"
+
+
+struct tcg_region_tree {
+    QemuMutex lock;
+    GTree *tree;
+    /* padding to avoid false sharing is computed at run-time */
+};
+
+/*
+ * We divide code_gen_buffer into equally-sized "regions" that TCG threads
+ * dynamically allocate from as demand dictates. Given appropriate region
+ * sizing, this minimizes flushes even when some TCG threads generate a lot
+ * more code than others.
+ */
+struct tcg_region_state {
+    QemuMutex lock;
+
+    /* fields set at init time */
+    void *start;
+    void *start_aligned;
+    void *end;
+    size_t n;
+    size_t size; /* size of one region */
+    size_t stride; /* .size + guard size */
+
+    /* fields protected by the lock */
+    size_t current; /* current region index */
+    size_t agg_size_full; /* aggregate size of full regions */
+};
+
+static struct tcg_region_state region;
+
+/*
+ * This is an array of struct tcg_region_tree's, with padding.
+ * We use void * to simplify the computation of region_trees[i]; each
+ * struct is found every tree_size bytes.
+ */
+static void *region_trees;
+static size_t tree_size;
+
+/* compare a pointer @ptr and a tb_tc @s */
+static int ptr_cmp_tb_tc(const void *ptr, const struct tb_tc *s)
+{
+    if (ptr >= s->ptr + s->size) {
+        return 1;
+    } else if (ptr < s->ptr) {
+        return -1;
+    }
+    return 0;
+}
+
+static gint tb_tc_cmp(gconstpointer ap, gconstpointer bp)
+{
+    const struct tb_tc *a = ap;
+    const struct tb_tc *b = bp;
+
+    /*
+     * When both sizes are set, we know this isn't a lookup.
+     * This is the most likely case: every TB must be inserted; lookups
+     * are a lot less frequent.
+     */
+    if (likely(a->size && b->size)) {
+        if (a->ptr > b->ptr) {
+            return 1;
+        } else if (a->ptr < b->ptr) {
+            return -1;
+        }
+        /* a->ptr == b->ptr should happen only on deletions */
+        g_assert(a->size == b->size);
+        return 0;
+    }
+    /*
+     * All lookups have either .size field set to 0.
+     * From the glib sources we see that @ap is always the lookup key. However
+     * the docs provide no guarantee, so we just mark this case as likely.
+     */
+    if (likely(a->size == 0)) {
+        return ptr_cmp_tb_tc(a->ptr, b);
+    }
+    return ptr_cmp_tb_tc(b->ptr, a);
+}
+
+static void tcg_region_trees_init(void)
+{
+    size_t i;
+
+    tree_size = ROUND_UP(sizeof(struct tcg_region_tree), qemu_dcache_linesize);
+    region_trees = qemu_memalign(qemu_dcache_linesize, region.n * tree_size);
+    for (i = 0; i < region.n; i++) {
+        struct tcg_region_tree *rt = region_trees + i * tree_size;
+
+        qemu_mutex_init(&rt->lock);
+        rt->tree = g_tree_new(tb_tc_cmp);
+    }
+}
+
+static struct tcg_region_tree *tc_ptr_to_region_tree(const void *p)
+{
+    size_t region_idx;
+
+    /*
+     * Like tcg_splitwx_to_rw, with no assert.  The pc may come from
+     * a signal handler over which the caller has no control.
+     */
+    if (!in_code_gen_buffer(p)) {
+        p -= tcg_splitwx_diff;
+        if (!in_code_gen_buffer(p)) {
+            return NULL;
+        }
+    }
+
+    if (p < region.start_aligned) {
+        region_idx = 0;
+    } else {
+        ptrdiff_t offset = p - region.start_aligned;
+
+        if (offset > region.stride * (region.n - 1)) {
+            region_idx = region.n - 1;
+        } else {
+            region_idx = offset / region.stride;
+        }
+    }
+    return region_trees + region_idx * tree_size;
+}
+
+void tcg_tb_insert(TranslationBlock *tb)
+{
+    struct tcg_region_tree *rt = tc_ptr_to_region_tree(tb->tc.ptr);
+
+    g_assert(rt != NULL);
+    qemu_mutex_lock(&rt->lock);
+    g_tree_insert(rt->tree, &tb->tc, tb);
+    qemu_mutex_unlock(&rt->lock);
+}
+
+void tcg_tb_remove(TranslationBlock *tb)
+{
+    struct tcg_region_tree *rt = tc_ptr_to_region_tree(tb->tc.ptr);
+
+    g_assert(rt != NULL);
+    qemu_mutex_lock(&rt->lock);
+    g_tree_remove(rt->tree, &tb->tc);
+    qemu_mutex_unlock(&rt->lock);
+}
+
+/*
+ * Find the TB 'tb' such that
+ * tb->tc.ptr <= tc_ptr < tb->tc.ptr + tb->tc.size
+ * Return NULL if not found.
+ */
+TranslationBlock *tcg_tb_lookup(uintptr_t tc_ptr)
+{
+    struct tcg_region_tree *rt = tc_ptr_to_region_tree((void *)tc_ptr);
+    TranslationBlock *tb;
+    struct tb_tc s = { .ptr = (void *)tc_ptr };
+
+    if (rt == NULL) {
+        return NULL;
+    }
+
+    qemu_mutex_lock(&rt->lock);
+    tb = g_tree_lookup(rt->tree, &s);
+    qemu_mutex_unlock(&rt->lock);
+    return tb;
+}
+
+static void tcg_region_tree_lock_all(void)
+{
+    size_t i;
+
+    for (i = 0; i < region.n; i++) {
+        struct tcg_region_tree *rt = region_trees + i * tree_size;
+
+        qemu_mutex_lock(&rt->lock);
+    }
+}
+
+static void tcg_region_tree_unlock_all(void)
+{
+    size_t i;
+
+    for (i = 0; i < region.n; i++) {
+        struct tcg_region_tree *rt = region_trees + i * tree_size;
+
+        qemu_mutex_unlock(&rt->lock);
+    }
+}
+
+void tcg_tb_foreach(GTraverseFunc func, gpointer user_data)
+{
+    size_t i;
+
+    tcg_region_tree_lock_all();
+    for (i = 0; i < region.n; i++) {
+        struct tcg_region_tree *rt = region_trees + i * tree_size;
+
+        g_tree_foreach(rt->tree, func, user_data);
+    }
+    tcg_region_tree_unlock_all();
+}
+
+size_t tcg_nb_tbs(void)
+{
+    size_t nb_tbs = 0;
+    size_t i;
+
+    tcg_region_tree_lock_all();
+    for (i = 0; i < region.n; i++) {
+        struct tcg_region_tree *rt = region_trees + i * tree_size;
+
+        nb_tbs += g_tree_nnodes(rt->tree);
+    }
+    tcg_region_tree_unlock_all();
+    return nb_tbs;
+}
+
+static gboolean tcg_region_tree_traverse(gpointer k, gpointer v, gpointer data)
+{
+    TranslationBlock *tb = v;
+
+    tb_destroy(tb);
+    return FALSE;
+}
+
+static void tcg_region_tree_reset_all(void)
+{
+    size_t i;
+
+    tcg_region_tree_lock_all();
+    for (i = 0; i < region.n; i++) {
+        struct tcg_region_tree *rt = region_trees + i * tree_size;
+
+        g_tree_foreach(rt->tree, tcg_region_tree_traverse, NULL);
+        /* Increment the refcount first so that destroy acts as a reset */
+        g_tree_ref(rt->tree);
+        g_tree_destroy(rt->tree);
+    }
+    tcg_region_tree_unlock_all();
+}
+
+static void tcg_region_bounds(size_t curr_region, void **pstart, void **pend)
+{
+    void *start, *end;
+
+    start = region.start_aligned + curr_region * region.stride;
+    end = start + region.size;
+
+    if (curr_region == 0) {
+        start = region.start;
+    }
+    if (curr_region == region.n - 1) {
+        end = region.end;
+    }
+
+    *pstart = start;
+    *pend = end;
+}
+
+static void tcg_region_assign(TCGContext *s, size_t curr_region)
+{
+    void *start, *end;
+
+    tcg_region_bounds(curr_region, &start, &end);
+
+    s->code_gen_buffer = start;
+    s->code_gen_ptr = start;
+    s->code_gen_buffer_size = end - start;
+    s->code_gen_highwater = end - TCG_HIGHWATER;
+}
+
+static bool tcg_region_alloc__locked(TCGContext *s)
+{
+    if (region.current == region.n) {
+        return true;
+    }
+    tcg_region_assign(s, region.current);
+    region.current++;
+    return false;
+}
+
+/*
+ * Request a new region once the one in use has filled up.
+ * Returns true on error.
+ */
+bool tcg_region_alloc(TCGContext *s)
+{
+    bool err;
+    /* read the region size now; alloc__locked will overwrite it on success */
+    size_t size_full = s->code_gen_buffer_size;
+
+    qemu_mutex_lock(&region.lock);
+    err = tcg_region_alloc__locked(s);
+    if (!err) {
+        region.agg_size_full += size_full - TCG_HIGHWATER;
+    }
+    qemu_mutex_unlock(&region.lock);
+    return err;
+}
+
+/*
+ * Perform a context's first region allocation.
+ * This function does _not_ increment region.agg_size_full.
+ */
+static void tcg_region_initial_alloc__locked(TCGContext *s)
+{
+    bool err = tcg_region_alloc__locked(s);
+    g_assert(!err);
+}
+
+void tcg_region_initial_alloc(TCGContext *s)
+{
+    qemu_mutex_lock(&region.lock);
+    tcg_region_initial_alloc__locked(s);
+    qemu_mutex_unlock(&region.lock);
+}
+
+/* Call from a safe-work context */
+void tcg_region_reset_all(void)
+{
+    unsigned int n_ctxs = qatomic_read(&n_tcg_ctxs);
+    unsigned int i;
+
+    qemu_mutex_lock(&region.lock);
+    region.current = 0;
+    region.agg_size_full = 0;
+
+    for (i = 0; i < n_ctxs; i++) {
+        TCGContext *s = qatomic_read(&tcg_ctxs[i]);
+        tcg_region_initial_alloc__locked(s);
+    }
+    qemu_mutex_unlock(&region.lock);
+
+    tcg_region_tree_reset_all();
+}
+
+#ifdef CONFIG_USER_ONLY
+static size_t tcg_n_regions(void)
+{
+    return 1;
+}
+#else
+/*
+ * It is likely that some vCPUs will translate more code than others, so we
+ * first try to set more regions than max_cpus, with those regions being of
+ * reasonable size. If that's not possible we make do by evenly dividing
+ * the code_gen_buffer among the vCPUs.
+ */
+static size_t tcg_n_regions(void)
+{
+    size_t i;
+
+    /* Use a single region if all we have is one vCPU thread */
+#if !defined(CONFIG_USER_ONLY)
+    MachineState *ms = MACHINE(qdev_get_machine());
+    unsigned int max_cpus = ms->smp.max_cpus;
+#endif
+    if (max_cpus == 1 || !qemu_tcg_mttcg_enabled()) {
+        return 1;
+    }
+
+    /* Try to have more regions than max_cpus, with each region being >= 2 MB */
+    for (i = 8; i > 0; i--) {
+        size_t regions_per_thread = i;
+        size_t region_size;
+
+        region_size = tcg_init_ctx.code_gen_buffer_size;
+        region_size /= max_cpus * regions_per_thread;
+
+        if (region_size >= 2 * 1024u * 1024) {
+            return max_cpus * regions_per_thread;
+        }
+    }
+    /* If we can't, then just allocate one region per vCPU thread */
+    return max_cpus;
+}
+#endif
+
+/*
+ * Initializes region partitioning.
+ *
+ * Called at init time from the parent thread (i.e. the one calling
+ * tcg_context_init), after the target's TCG globals have been set.
+ *
+ * Region partitioning works by splitting code_gen_buffer into separate regions,
+ * and then assigning regions to TCG threads so that the threads can translate
+ * code in parallel without synchronization.
+ *
+ * In softmmu the number of TCG threads is bounded by max_cpus, so we use at
+ * least max_cpus regions in MTTCG. In !MTTCG we use a single region.
+ * Note that the TCG options from the command-line (i.e. -accel accel=tcg,[...])
+ * must have been parsed before calling this function, since it calls
+ * qemu_tcg_mttcg_enabled().
+ *
+ * In user-mode we use a single region.  Having multiple regions in user-mode
+ * is not supported, because the number of vCPU threads (recall that each thread
+ * spawned by the guest corresponds to a vCPU thread) is only bounded by the
+ * OS, and usually this number is huge (tens of thousands is not uncommon).
+ * Thus, given this large bound on the number of vCPU threads and the fact
+ * that code_gen_buffer is allocated at compile-time, we cannot guarantee
+ * that the availability of at least one region per vCPU thread.
+ *
+ * However, this user-mode limitation is unlikely to be a significant problem
+ * in practice. Multi-threaded guests share most if not all of their translated
+ * code, which makes parallel code generation less appealing than in softmmu.
+ */
+void tcg_region_init(void)
+{
+    void *buf = tcg_init_ctx.code_gen_buffer;
+    void *aligned;
+    size_t size = tcg_init_ctx.code_gen_buffer_size;
+    size_t page_size = qemu_real_host_page_size;
+    size_t region_size;
+    size_t n_regions;
+    size_t i;
+    uintptr_t splitwx_diff;
+
+    n_regions = tcg_n_regions();
+
+    /* The first region will be 'aligned - buf' bytes larger than the others */
+    aligned = QEMU_ALIGN_PTR_UP(buf, page_size);
+    g_assert(aligned < tcg_init_ctx.code_gen_buffer + size);
+    /*
+     * Make region_size a multiple of page_size, using aligned as the start.
+     * As a result of this we might end up with a few extra pages at the end of
+     * the buffer; we will assign those to the last region.
+     */
+    region_size = (size - (aligned - buf)) / n_regions;
+    region_size = QEMU_ALIGN_DOWN(region_size, page_size);
+
+    /* A region must have at least 2 pages; one code, one guard */
+    g_assert(region_size >= 2 * page_size);
+
+    /* init the region struct */
+    qemu_mutex_init(&region.lock);
+    region.n = n_regions;
+    region.size = region_size - page_size;
+    region.stride = region_size;
+    region.start = buf;
+    region.start_aligned = aligned;
+    /* page-align the end, since its last page will be a guard page */
+    region.end = QEMU_ALIGN_PTR_DOWN(buf + size, page_size);
+    /* account for that last guard page */
+    region.end -= page_size;
+
+    /* set guard pages */
+    splitwx_diff = tcg_splitwx_diff;
+    for (i = 0; i < region.n; i++) {
+        void *start, *end;
+        int rc;
+
+        tcg_region_bounds(i, &start, &end);
+        rc = qemu_mprotect_none(end, page_size);
+        g_assert(!rc);
+        if (splitwx_diff) {
+            rc = qemu_mprotect_none(end + splitwx_diff, page_size);
+            g_assert(!rc);
+        }
+    }
+
+    tcg_region_trees_init();
+
+    /*
+     * Leave the initial context initialized to the first region.
+     * This will be the context into which we generate the prologue.
+     * It is also the only context for CONFIG_USER_ONLY.
+     */
+    tcg_region_initial_alloc__locked(&tcg_init_ctx);
+}
+
+void tcg_region_prologue_set(TCGContext *s)
+{
+    /* Deduct the prologue from the first region.  */
+    g_assert(region.start == s->code_gen_buffer);
+    region.start = s->code_ptr;
+
+    /* Recompute boundaries of the first region. */
+    tcg_region_assign(s, 0);
+
+    /* Register the balance of the buffer with gdb. */
+    tcg_register_jit(tcg_splitwx_to_rx(region.start),
+                     region.end - region.start);
+}
+
+/*
+ * Returns the size (in bytes) of all translated code (i.e. from all regions)
+ * currently in the cache.
+ * See also: tcg_code_capacity()
+ * Do not confuse with tcg_current_code_size(); that one applies to a single
+ * TCG context.
+ */
+size_t tcg_code_size(void)
+{
+    unsigned int n_ctxs = qatomic_read(&n_tcg_ctxs);
+    unsigned int i;
+    size_t total;
+
+    qemu_mutex_lock(&region.lock);
+    total = region.agg_size_full;
+    for (i = 0; i < n_ctxs; i++) {
+        const TCGContext *s = qatomic_read(&tcg_ctxs[i]);
+        size_t size;
+
+        size = qatomic_read(&s->code_gen_ptr) - s->code_gen_buffer;
+        g_assert(size <= s->code_gen_buffer_size);
+        total += size;
+    }
+    qemu_mutex_unlock(&region.lock);
+    return total;
+}
+
+/*
+ * Returns the code capacity (in bytes) of the entire cache, i.e. including all
+ * regions.
+ * See also: tcg_code_size()
+ */
+size_t tcg_code_capacity(void)
+{
+    size_t guard_size, capacity;
+
+    /* no need for synchronization; these variables are set at init time */
+    guard_size = region.stride - region.size;
+    capacity = region.end + guard_size - region.start;
+    capacity -= region.n * (guard_size + TCG_HIGHWATER);
+    return capacity;
+}
+
+size_t tcg_tb_phys_invalidate_count(void)
+{
+    unsigned int n_ctxs = qatomic_read(&n_tcg_ctxs);
+    unsigned int i;
+    size_t total = 0;
+
+    for (i = 0; i < n_ctxs; i++) {
+        const TCGContext *s = qatomic_read(&tcg_ctxs[i]);
+
+        total += qatomic_read(&s->tb_phys_invalidate_count);
+    }
+    return total;
+}
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 5b3525d52a..10a571d41c 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -65,6 +65,7 @@
 #include "elf.h"
 #include "exec/log.h"
 #include "sysemu/sysemu.h"
+#include "internal.h"
 
 /* Forward declarations for functions declared in tcg-target.c.inc and
    used here. */
@@ -153,10 +154,8 @@ static int tcg_target_const_match(tcg_target_long val, TCGType type,
 static int tcg_out_ldst_finalize(TCGContext *s);
 #endif
 
-#define TCG_HIGHWATER 1024
-
-static TCGContext **tcg_ctxs;
-static unsigned int n_tcg_ctxs;
+TCGContext **tcg_ctxs;
+unsigned int n_tcg_ctxs;
 TCGv_env cpu_env = 0;
 const void *tcg_code_gen_epilogue;
 uintptr_t tcg_splitwx_diff;
@@ -165,42 +164,6 @@ uintptr_t tcg_splitwx_diff;
 tcg_prologue_fn *tcg_qemu_tb_exec;
 #endif
 
-struct tcg_region_tree {
-    QemuMutex lock;
-    GTree *tree;
-    /* padding to avoid false sharing is computed at run-time */
-};
-
-/*
- * We divide code_gen_buffer into equally-sized "regions" that TCG threads
- * dynamically allocate from as demand dictates. Given appropriate region
- * sizing, this minimizes flushes even when some TCG threads generate a lot
- * more code than others.
- */
-struct tcg_region_state {
-    QemuMutex lock;
-
-    /* fields set at init time */
-    void *start;
-    void *start_aligned;
-    void *end;
-    size_t n;
-    size_t size; /* size of one region */
-    size_t stride; /* .size + guard size */
-
-    /* fields protected by the lock */
-    size_t current; /* current region index */
-    size_t agg_size_full; /* aggregate size of full regions */
-};
-
-static struct tcg_region_state region;
-/*
- * This is an array of struct tcg_region_tree's, with padding.
- * We use void * to simplify the computation of region_trees[i]; each
- * struct is found every tree_size bytes.
- */
-static void *region_trees;
-static size_t tree_size;
 static TCGRegSet tcg_target_available_regs[TCG_TYPE_COUNT];
 static TCGRegSet tcg_target_call_clobber_regs;
 
@@ -457,451 +420,6 @@ static const TCGTargetOpDef constraint_sets[] = {
 
 #include "tcg-target.c.inc"
 
-/* compare a pointer @ptr and a tb_tc @s */
-static int ptr_cmp_tb_tc(const void *ptr, const struct tb_tc *s)
-{
-    if (ptr >= s->ptr + s->size) {
-        return 1;
-    } else if (ptr < s->ptr) {
-        return -1;
-    }
-    return 0;
-}
-
-static gint tb_tc_cmp(gconstpointer ap, gconstpointer bp)
-{
-    const struct tb_tc *a = ap;
-    const struct tb_tc *b = bp;
-
-    /*
-     * When both sizes are set, we know this isn't a lookup.
-     * This is the most likely case: every TB must be inserted; lookups
-     * are a lot less frequent.
-     */
-    if (likely(a->size && b->size)) {
-        if (a->ptr > b->ptr) {
-            return 1;
-        } else if (a->ptr < b->ptr) {
-            return -1;
-        }
-        /* a->ptr == b->ptr should happen only on deletions */
-        g_assert(a->size == b->size);
-        return 0;
-    }
-    /*
-     * All lookups have either .size field set to 0.
-     * From the glib sources we see that @ap is always the lookup key. However
-     * the docs provide no guarantee, so we just mark this case as likely.
-     */
-    if (likely(a->size == 0)) {
-        return ptr_cmp_tb_tc(a->ptr, b);
-    }
-    return ptr_cmp_tb_tc(b->ptr, a);
-}
-
-static void tcg_region_trees_init(void)
-{
-    size_t i;
-
-    tree_size = ROUND_UP(sizeof(struct tcg_region_tree), qemu_dcache_linesize);
-    region_trees = qemu_memalign(qemu_dcache_linesize, region.n * tree_size);
-    for (i = 0; i < region.n; i++) {
-        struct tcg_region_tree *rt = region_trees + i * tree_size;
-
-        qemu_mutex_init(&rt->lock);
-        rt->tree = g_tree_new(tb_tc_cmp);
-    }
-}
-
-static struct tcg_region_tree *tc_ptr_to_region_tree(const void *p)
-{
-    size_t region_idx;
-
-    /*
-     * Like tcg_splitwx_to_rw, with no assert.  The pc may come from
-     * a signal handler over which the caller has no control.
-     */
-    if (!in_code_gen_buffer(p)) {
-        p -= tcg_splitwx_diff;
-        if (!in_code_gen_buffer(p)) {
-            return NULL;
-        }
-    }
-
-    if (p < region.start_aligned) {
-        region_idx = 0;
-    } else {
-        ptrdiff_t offset = p - region.start_aligned;
-
-        if (offset > region.stride * (region.n - 1)) {
-            region_idx = region.n - 1;
-        } else {
-            region_idx = offset / region.stride;
-        }
-    }
-    return region_trees + region_idx * tree_size;
-}
-
-void tcg_tb_insert(TranslationBlock *tb)
-{
-    struct tcg_region_tree *rt = tc_ptr_to_region_tree(tb->tc.ptr);
-
-    g_assert(rt != NULL);
-    qemu_mutex_lock(&rt->lock);
-    g_tree_insert(rt->tree, &tb->tc, tb);
-    qemu_mutex_unlock(&rt->lock);
-}
-
-void tcg_tb_remove(TranslationBlock *tb)
-{
-    struct tcg_region_tree *rt = tc_ptr_to_region_tree(tb->tc.ptr);
-
-    g_assert(rt != NULL);
-    qemu_mutex_lock(&rt->lock);
-    g_tree_remove(rt->tree, &tb->tc);
-    qemu_mutex_unlock(&rt->lock);
-}
-
-/*
- * Find the TB 'tb' such that
- * tb->tc.ptr <= tc_ptr < tb->tc.ptr + tb->tc.size
- * Return NULL if not found.
- */
-TranslationBlock *tcg_tb_lookup(uintptr_t tc_ptr)
-{
-    struct tcg_region_tree *rt = tc_ptr_to_region_tree((void *)tc_ptr);
-    TranslationBlock *tb;
-    struct tb_tc s = { .ptr = (void *)tc_ptr };
-
-    if (rt == NULL) {
-        return NULL;
-    }
-
-    qemu_mutex_lock(&rt->lock);
-    tb = g_tree_lookup(rt->tree, &s);
-    qemu_mutex_unlock(&rt->lock);
-    return tb;
-}
-
-static void tcg_region_tree_lock_all(void)
-{
-    size_t i;
-
-    for (i = 0; i < region.n; i++) {
-        struct tcg_region_tree *rt = region_trees + i * tree_size;
-
-        qemu_mutex_lock(&rt->lock);
-    }
-}
-
-static void tcg_region_tree_unlock_all(void)
-{
-    size_t i;
-
-    for (i = 0; i < region.n; i++) {
-        struct tcg_region_tree *rt = region_trees + i * tree_size;
-
-        qemu_mutex_unlock(&rt->lock);
-    }
-}
-
-void tcg_tb_foreach(GTraverseFunc func, gpointer user_data)
-{
-    size_t i;
-
-    tcg_region_tree_lock_all();
-    for (i = 0; i < region.n; i++) {
-        struct tcg_region_tree *rt = region_trees + i * tree_size;
-
-        g_tree_foreach(rt->tree, func, user_data);
-    }
-    tcg_region_tree_unlock_all();
-}
-
-size_t tcg_nb_tbs(void)
-{
-    size_t nb_tbs = 0;
-    size_t i;
-
-    tcg_region_tree_lock_all();
-    for (i = 0; i < region.n; i++) {
-        struct tcg_region_tree *rt = region_trees + i * tree_size;
-
-        nb_tbs += g_tree_nnodes(rt->tree);
-    }
-    tcg_region_tree_unlock_all();
-    return nb_tbs;
-}
-
-static gboolean tcg_region_tree_traverse(gpointer k, gpointer v, gpointer data)
-{
-    TranslationBlock *tb = v;
-
-    tb_destroy(tb);
-    return FALSE;
-}
-
-static void tcg_region_tree_reset_all(void)
-{
-    size_t i;
-
-    tcg_region_tree_lock_all();
-    for (i = 0; i < region.n; i++) {
-        struct tcg_region_tree *rt = region_trees + i * tree_size;
-
-        g_tree_foreach(rt->tree, tcg_region_tree_traverse, NULL);
-        /* Increment the refcount first so that destroy acts as a reset */
-        g_tree_ref(rt->tree);
-        g_tree_destroy(rt->tree);
-    }
-    tcg_region_tree_unlock_all();
-}
-
-static void tcg_region_bounds(size_t curr_region, void **pstart, void **pend)
-{
-    void *start, *end;
-
-    start = region.start_aligned + curr_region * region.stride;
-    end = start + region.size;
-
-    if (curr_region == 0) {
-        start = region.start;
-    }
-    if (curr_region == region.n - 1) {
-        end = region.end;
-    }
-
-    *pstart = start;
-    *pend = end;
-}
-
-static void tcg_region_assign(TCGContext *s, size_t curr_region)
-{
-    void *start, *end;
-
-    tcg_region_bounds(curr_region, &start, &end);
-
-    s->code_gen_buffer = start;
-    s->code_gen_ptr = start;
-    s->code_gen_buffer_size = end - start;
-    s->code_gen_highwater = end - TCG_HIGHWATER;
-}
-
-static bool tcg_region_alloc__locked(TCGContext *s)
-{
-    if (region.current == region.n) {
-        return true;
-    }
-    tcg_region_assign(s, region.current);
-    region.current++;
-    return false;
-}
-
-/*
- * Request a new region once the one in use has filled up.
- * Returns true on error.
- */
-static bool tcg_region_alloc(TCGContext *s)
-{
-    bool err;
-    /* read the region size now; alloc__locked will overwrite it on success */
-    size_t size_full = s->code_gen_buffer_size;
-
-    qemu_mutex_lock(&region.lock);
-    err = tcg_region_alloc__locked(s);
-    if (!err) {
-        region.agg_size_full += size_full - TCG_HIGHWATER;
-    }
-    qemu_mutex_unlock(&region.lock);
-    return err;
-}
-
-/*
- * Perform a context's first region allocation.
- * This function does _not_ increment region.agg_size_full.
- */
-static void tcg_region_initial_alloc__locked(TCGContext *s)
-{
-    bool err = tcg_region_alloc__locked(s);
-    g_assert(!err);
-}
-
-#ifndef CONFIG_USER_ONLY
-static void tcg_region_initial_alloc(TCGContext *s)
-{
-    qemu_mutex_lock(&region.lock);
-    tcg_region_initial_alloc__locked(s);
-    qemu_mutex_unlock(&region.lock);
-}
-#endif
-
-/* Call from a safe-work context */
-void tcg_region_reset_all(void)
-{
-    unsigned int n_ctxs = qatomic_read(&n_tcg_ctxs);
-    unsigned int i;
-
-    qemu_mutex_lock(&region.lock);
-    region.current = 0;
-    region.agg_size_full = 0;
-
-    for (i = 0; i < n_ctxs; i++) {
-        TCGContext *s = qatomic_read(&tcg_ctxs[i]);
-        tcg_region_initial_alloc__locked(s);
-    }
-    qemu_mutex_unlock(&region.lock);
-
-    tcg_region_tree_reset_all();
-}
-
-#ifdef CONFIG_USER_ONLY
-static size_t tcg_n_regions(void)
-{
-    return 1;
-}
-#else
-/*
- * It is likely that some vCPUs will translate more code than others, so we
- * first try to set more regions than max_cpus, with those regions being of
- * reasonable size. If that's not possible we make do by evenly dividing
- * the code_gen_buffer among the vCPUs.
- */
-static size_t tcg_n_regions(void)
-{
-    size_t i;
-
-    /* Use a single region if all we have is one vCPU thread */
-#if !defined(CONFIG_USER_ONLY)
-    MachineState *ms = MACHINE(qdev_get_machine());
-    unsigned int max_cpus = ms->smp.max_cpus;
-#endif
-    if (max_cpus == 1 || !qemu_tcg_mttcg_enabled()) {
-        return 1;
-    }
-
-    /* Try to have more regions than max_cpus, with each region being >= 2 MB */
-    for (i = 8; i > 0; i--) {
-        size_t regions_per_thread = i;
-        size_t region_size;
-
-        region_size = tcg_init_ctx.code_gen_buffer_size;
-        region_size /= max_cpus * regions_per_thread;
-
-        if (region_size >= 2 * 1024u * 1024) {
-            return max_cpus * regions_per_thread;
-        }
-    }
-    /* If we can't, then just allocate one region per vCPU thread */
-    return max_cpus;
-}
-#endif
-
-/*
- * Initializes region partitioning.
- *
- * Called at init time from the parent thread (i.e. the one calling
- * tcg_context_init), after the target's TCG globals have been set.
- *
- * Region partitioning works by splitting code_gen_buffer into separate regions,
- * and then assigning regions to TCG threads so that the threads can translate
- * code in parallel without synchronization.
- *
- * In softmmu the number of TCG threads is bounded by max_cpus, so we use at
- * least max_cpus regions in MTTCG. In !MTTCG we use a single region.
- * Note that the TCG options from the command-line (i.e. -accel accel=tcg,[...])
- * must have been parsed before calling this function, since it calls
- * qemu_tcg_mttcg_enabled().
- *
- * In user-mode we use a single region.  Having multiple regions in user-mode
- * is not supported, because the number of vCPU threads (recall that each thread
- * spawned by the guest corresponds to a vCPU thread) is only bounded by the
- * OS, and usually this number is huge (tens of thousands is not uncommon).
- * Thus, given this large bound on the number of vCPU threads and the fact
- * that code_gen_buffer is allocated at compile-time, we cannot guarantee
- * that the availability of at least one region per vCPU thread.
- *
- * However, this user-mode limitation is unlikely to be a significant problem
- * in practice. Multi-threaded guests share most if not all of their translated
- * code, which makes parallel code generation less appealing than in softmmu.
- */
-void tcg_region_init(void)
-{
-    void *buf = tcg_init_ctx.code_gen_buffer;
-    void *aligned;
-    size_t size = tcg_init_ctx.code_gen_buffer_size;
-    size_t page_size = qemu_real_host_page_size;
-    size_t region_size;
-    size_t n_regions;
-    size_t i;
-    uintptr_t splitwx_diff;
-
-    n_regions = tcg_n_regions();
-
-    /* The first region will be 'aligned - buf' bytes larger than the others */
-    aligned = QEMU_ALIGN_PTR_UP(buf, page_size);
-    g_assert(aligned < tcg_init_ctx.code_gen_buffer + size);
-    /*
-     * Make region_size a multiple of page_size, using aligned as the start.
-     * As a result of this we might end up with a few extra pages at the end of
-     * the buffer; we will assign those to the last region.
-     */
-    region_size = (size - (aligned - buf)) / n_regions;
-    region_size = QEMU_ALIGN_DOWN(region_size, page_size);
-
-    /* A region must have at least 2 pages; one code, one guard */
-    g_assert(region_size >= 2 * page_size);
-
-    /* init the region struct */
-    qemu_mutex_init(&region.lock);
-    region.n = n_regions;
-    region.size = region_size - page_size;
-    region.stride = region_size;
-    region.start = buf;
-    region.start_aligned = aligned;
-    /* page-align the end, since its last page will be a guard page */
-    region.end = QEMU_ALIGN_PTR_DOWN(buf + size, page_size);
-    /* account for that last guard page */
-    region.end -= page_size;
-
-    /* set guard pages */
-    splitwx_diff = tcg_splitwx_diff;
-    for (i = 0; i < region.n; i++) {
-        void *start, *end;
-        int rc;
-
-        tcg_region_bounds(i, &start, &end);
-        rc = qemu_mprotect_none(end, page_size);
-        g_assert(!rc);
-        if (splitwx_diff) {
-            rc = qemu_mprotect_none(end + splitwx_diff, page_size);
-            g_assert(!rc);
-        }
-    }
-
-    tcg_region_trees_init();
-
-    /*
-     * Leave the initial context initialized to the first region.
-     * This will be the context into which we generate the prologue.
-     * It is also the only context for CONFIG_USER_ONLY.
-     */
-    tcg_region_initial_alloc__locked(&tcg_init_ctx);
-}
-
-static void tcg_region_prologue_set(TCGContext *s)
-{
-    /* Deduct the prologue from the first region.  */
-    g_assert(region.start == s->code_gen_buffer);
-    region.start = s->code_ptr;
-
-    /* Recompute boundaries of the first region. */
-    tcg_region_assign(s, 0);
-
-    /* Register the balance of the buffer with gdb. */
-    tcg_register_jit(tcg_splitwx_to_rx(region.start),
-                     region.end - region.start);
-}
-
 #ifdef CONFIG_DEBUG_TCG
 const void *tcg_splitwx_to_rx(void *rw)
 {
@@ -986,63 +504,6 @@ void tcg_register_thread(void)
 }
 #endif /* !CONFIG_USER_ONLY */
 
-/*
- * Returns the size (in bytes) of all translated code (i.e. from all regions)
- * currently in the cache.
- * See also: tcg_code_capacity()
- * Do not confuse with tcg_current_code_size(); that one applies to a single
- * TCG context.
- */
-size_t tcg_code_size(void)
-{
-    unsigned int n_ctxs = qatomic_read(&n_tcg_ctxs);
-    unsigned int i;
-    size_t total;
-
-    qemu_mutex_lock(&region.lock);
-    total = region.agg_size_full;
-    for (i = 0; i < n_ctxs; i++) {
-        const TCGContext *s = qatomic_read(&tcg_ctxs[i]);
-        size_t size;
-
-        size = qatomic_read(&s->code_gen_ptr) - s->code_gen_buffer;
-        g_assert(size <= s->code_gen_buffer_size);
-        total += size;
-    }
-    qemu_mutex_unlock(&region.lock);
-    return total;
-}
-
-/*
- * Returns the code capacity (in bytes) of the entire cache, i.e. including all
- * regions.
- * See also: tcg_code_size()
- */
-size_t tcg_code_capacity(void)
-{
-    size_t guard_size, capacity;
-
-    /* no need for synchronization; these variables are set at init time */
-    guard_size = region.stride - region.size;
-    capacity = region.end + guard_size - region.start;
-    capacity -= region.n * (guard_size + TCG_HIGHWATER);
-    return capacity;
-}
-
-size_t tcg_tb_phys_invalidate_count(void)
-{
-    unsigned int n_ctxs = qatomic_read(&n_tcg_ctxs);
-    unsigned int i;
-    size_t total = 0;
-
-    for (i = 0; i < n_ctxs; i++) {
-        const TCGContext *s = qatomic_read(&tcg_ctxs[i]);
-
-        total += qatomic_read(&s->tb_phys_invalidate_count);
-    }
-    return total;
-}
-
 /* pool based memory allocation */
 void *tcg_malloc_internal(TCGContext *s, int size)
 {
diff --git a/tcg/meson.build b/tcg/meson.build
index 84064a341e..5be3915529 100644
--- a/tcg/meson.build
+++ b/tcg/meson.build
@@ -2,6 +2,7 @@ tcg_ss = ss.source_set()
 
 tcg_ss.add(files(
   'optimize.c',
+  'region.c',
   'tcg.c',
   'tcg-common.c',
   'tcg-op.c',
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 08/29] accel/tcg: Inline cpu_gen_init
  2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
                   ` (6 preceding siblings ...)
  2021-03-14 21:27 ` [PATCH v2 07/29] tcg: Split out region.c Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
  2021-03-14 21:27 ` [PATCH v2 09/29] accel/tcg: Move alloc_code_gen_buffer to tcg/region.c Richard Henderson
                   ` (22 subsequent siblings)
  30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: r.bolshakov, j, Philippe Mathieu-Daudé

It consists of one function call and has only one caller.

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/translate-all.c | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index b9057567f4..6d3184e7da 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -245,11 +245,6 @@ static void page_table_config_init(void)
     assert(v_l2_levels >= 0);
 }
 
-static void cpu_gen_init(void)
-{
-    tcg_context_init(&tcg_init_ctx);
-}
-
 /* Encode VAL as a signed leb128 sequence at P.
    Return P incremented past the encoded value.  */
 static uint8_t *encode_sleb128(uint8_t *p, target_long val)
@@ -1331,7 +1326,7 @@ void tcg_exec_init(unsigned long tb_size, int splitwx)
     bool ok;
 
     tcg_allowed = true;
-    cpu_gen_init();
+    tcg_context_init(&tcg_init_ctx);
     page_init();
     tb_htable_init();
 
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 09/29] accel/tcg: Move alloc_code_gen_buffer to tcg/region.c
  2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
                   ` (7 preceding siblings ...)
  2021-03-14 21:27 ` [PATCH v2 08/29] accel/tcg: Inline cpu_gen_init Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
  2021-03-14 21:27 ` [PATCH v2 10/29] accel/tcg: Rename tcg_init to tcg_init_machine Richard Henderson
                   ` (21 subsequent siblings)
  30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: r.bolshakov, j

Buffer management is integral to tcg.  Do not leave the allocation
to code outside of tcg/.  This is code movement, with further
cleanups to follow.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/tcg/tcg.h         |   2 +-
 accel/tcg/translate-all.c | 414 +------------------------------------
 tcg/region.c              | 421 +++++++++++++++++++++++++++++++++++++-
 3 files changed, 418 insertions(+), 419 deletions(-)

diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index 0f0695e90d..7a435bf807 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -874,7 +874,7 @@ void *tcg_malloc_internal(TCGContext *s, int size);
 void tcg_pool_reset(TCGContext *s);
 TranslationBlock *tcg_tb_alloc(TCGContext *s);
 
-void tcg_region_init(void);
+void tcg_region_init(size_t tb_size, int splitwx);
 void tb_destroy(TranslationBlock *tb);
 void tcg_region_reset_all(void);
 
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 6d3184e7da..4071edda16 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -18,7 +18,6 @@
  */
 
 #include "qemu/osdep.h"
-#include "qemu/units.h"
 #include "qemu-common.h"
 
 #define NO_CPU_IO_DEFS
@@ -51,7 +50,6 @@
 #include "exec/tb-hash.h"
 #include "exec/translate-all.h"
 #include "qemu/bitmap.h"
-#include "qemu/error-report.h"
 #include "qemu/qemu-print.h"
 #include "qemu/timer.h"
 #include "qemu/main-loop.h"
@@ -895,408 +893,6 @@ static void page_lock_pair(PageDesc **ret_p1, tb_page_addr_t phys1,
     }
 }
 
-/* Minimum size of the code gen buffer.  This number is randomly chosen,
-   but not so small that we can't have a fair number of TB's live.  */
-#define MIN_CODE_GEN_BUFFER_SIZE     (1 * MiB)
-
-/* Maximum size of the code gen buffer we'd like to use.  Unless otherwise
-   indicated, this is constrained by the range of direct branches on the
-   host cpu, as used by the TCG implementation of goto_tb.  */
-#if defined(__x86_64__)
-# define MAX_CODE_GEN_BUFFER_SIZE  (2 * GiB)
-#elif defined(__sparc__)
-# define MAX_CODE_GEN_BUFFER_SIZE  (2 * GiB)
-#elif defined(__powerpc64__)
-# define MAX_CODE_GEN_BUFFER_SIZE  (2 * GiB)
-#elif defined(__powerpc__)
-# define MAX_CODE_GEN_BUFFER_SIZE  (32 * MiB)
-#elif defined(__aarch64__)
-# define MAX_CODE_GEN_BUFFER_SIZE  (2 * GiB)
-#elif defined(__s390x__)
-  /* We have a +- 4GB range on the branches; leave some slop.  */
-# define MAX_CODE_GEN_BUFFER_SIZE  (3 * GiB)
-#elif defined(__mips__)
-  /* We have a 256MB branch region, but leave room to make sure the
-     main executable is also within that region.  */
-# define MAX_CODE_GEN_BUFFER_SIZE  (128 * MiB)
-#else
-# define MAX_CODE_GEN_BUFFER_SIZE  ((size_t)-1)
-#endif
-
-#if TCG_TARGET_REG_BITS == 32
-#define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (32 * MiB)
-#ifdef CONFIG_USER_ONLY
-/*
- * For user mode on smaller 32 bit systems we may run into trouble
- * allocating big chunks of data in the right place. On these systems
- * we utilise a static code generation buffer directly in the binary.
- */
-#define USE_STATIC_CODE_GEN_BUFFER
-#endif
-#else /* TCG_TARGET_REG_BITS == 64 */
-#ifdef CONFIG_USER_ONLY
-/*
- * As user-mode emulation typically means running multiple instances
- * of the translator don't go too nuts with our default code gen
- * buffer lest we make things too hard for the OS.
- */
-#define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (128 * MiB)
-#else
-/*
- * We expect most system emulation to run one or two guests per host.
- * Users running large scale system emulation may want to tweak their
- * runtime setup via the tb-size control on the command line.
- */
-#define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (1 * GiB)
-#endif
-#endif
-
-#define DEFAULT_CODE_GEN_BUFFER_SIZE \
-  (DEFAULT_CODE_GEN_BUFFER_SIZE_1 < MAX_CODE_GEN_BUFFER_SIZE \
-   ? DEFAULT_CODE_GEN_BUFFER_SIZE_1 : MAX_CODE_GEN_BUFFER_SIZE)
-
-static size_t size_code_gen_buffer(size_t tb_size)
-{
-    /* Size the buffer.  */
-    if (tb_size == 0) {
-        size_t phys_mem = qemu_get_host_physmem();
-        if (phys_mem == 0) {
-            tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE;
-        } else {
-            tb_size = MIN(DEFAULT_CODE_GEN_BUFFER_SIZE, phys_mem / 8);
-        }
-    }
-    if (tb_size < MIN_CODE_GEN_BUFFER_SIZE) {
-        tb_size = MIN_CODE_GEN_BUFFER_SIZE;
-    }
-    if (tb_size > MAX_CODE_GEN_BUFFER_SIZE) {
-        tb_size = MAX_CODE_GEN_BUFFER_SIZE;
-    }
-    return tb_size;
-}
-
-#ifdef __mips__
-/* In order to use J and JAL within the code_gen_buffer, we require
-   that the buffer not cross a 256MB boundary.  */
-static inline bool cross_256mb(void *addr, size_t size)
-{
-    return ((uintptr_t)addr ^ ((uintptr_t)addr + size)) & ~0x0ffffffful;
-}
-
-/* We weren't able to allocate a buffer without crossing that boundary,
-   so make do with the larger portion of the buffer that doesn't cross.
-   Returns the new base of the buffer, and adjusts code_gen_buffer_size.  */
-static inline void *split_cross_256mb(void *buf1, size_t size1)
-{
-    void *buf2 = (void *)(((uintptr_t)buf1 + size1) & ~0x0ffffffful);
-    size_t size2 = buf1 + size1 - buf2;
-
-    size1 = buf2 - buf1;
-    if (size1 < size2) {
-        size1 = size2;
-        buf1 = buf2;
-    }
-
-    tcg_ctx->code_gen_buffer_size = size1;
-    return buf1;
-}
-#endif
-
-#ifdef USE_STATIC_CODE_GEN_BUFFER
-static uint8_t static_code_gen_buffer[DEFAULT_CODE_GEN_BUFFER_SIZE]
-    __attribute__((aligned(CODE_GEN_ALIGN)));
-
-static bool alloc_code_gen_buffer(size_t tb_size, int splitwx, Error **errp)
-{
-    void *buf, *end;
-    size_t size;
-
-    if (splitwx > 0) {
-        error_setg(errp, "jit split-wx not supported");
-        return false;
-    }
-
-    /* page-align the beginning and end of the buffer */
-    buf = static_code_gen_buffer;
-    end = static_code_gen_buffer + sizeof(static_code_gen_buffer);
-    buf = QEMU_ALIGN_PTR_UP(buf, qemu_real_host_page_size);
-    end = QEMU_ALIGN_PTR_DOWN(end, qemu_real_host_page_size);
-
-    size = end - buf;
-
-    /* Honor a command-line option limiting the size of the buffer.  */
-    if (size > tb_size) {
-        size = QEMU_ALIGN_DOWN(tb_size, qemu_real_host_page_size);
-    }
-    tcg_ctx->code_gen_buffer_size = size;
-
-#ifdef __mips__
-    if (cross_256mb(buf, size)) {
-        buf = split_cross_256mb(buf, size);
-        size = tcg_ctx->code_gen_buffer_size;
-    }
-#endif
-
-    if (qemu_mprotect_rwx(buf, size)) {
-        error_setg_errno(errp, errno, "mprotect of jit buffer");
-        return false;
-    }
-    qemu_madvise(buf, size, QEMU_MADV_HUGEPAGE);
-
-    tcg_ctx->code_gen_buffer = buf;
-    return true;
-}
-#elif defined(_WIN32)
-static bool alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
-{
-    void *buf;
-
-    if (splitwx > 0) {
-        error_setg(errp, "jit split-wx not supported");
-        return false;
-    }
-
-    buf = VirtualAlloc(NULL, size, MEM_RESERVE | MEM_COMMIT,
-                             PAGE_EXECUTE_READWRITE);
-    if (buf == NULL) {
-        error_setg_win32(errp, GetLastError(),
-                         "allocate %zu bytes for jit buffer", size);
-        return false;
-    }
-
-    tcg_ctx->code_gen_buffer = buf;
-    tcg_ctx->code_gen_buffer_size = size;
-    return true;
-}
-#else
-static bool alloc_code_gen_buffer_anon(size_t size, int prot,
-                                       int flags, Error **errp)
-{
-    void *buf;
-
-    buf = mmap(NULL, size, prot, flags, -1, 0);
-    if (buf == MAP_FAILED) {
-        error_setg_errno(errp, errno,
-                         "allocate %zu bytes for jit buffer", size);
-        return false;
-    }
-    tcg_ctx->code_gen_buffer_size = size;
-
-#ifdef __mips__
-    if (cross_256mb(buf, size)) {
-        /*
-         * Try again, with the original still mapped, to avoid re-acquiring
-         * the same 256mb crossing.
-         */
-        size_t size2;
-        void *buf2 = mmap(NULL, size, prot, flags, -1, 0);
-        switch ((int)(buf2 != MAP_FAILED)) {
-        case 1:
-            if (!cross_256mb(buf2, size)) {
-                /* Success!  Use the new buffer.  */
-                munmap(buf, size);
-                break;
-            }
-            /* Failure.  Work with what we had.  */
-            munmap(buf2, size);
-            /* fallthru */
-        default:
-            /* Split the original buffer.  Free the smaller half.  */
-            buf2 = split_cross_256mb(buf, size);
-            size2 = tcg_ctx->code_gen_buffer_size;
-            if (buf == buf2) {
-                munmap(buf + size2, size - size2);
-            } else {
-                munmap(buf, size - size2);
-            }
-            size = size2;
-            break;
-        }
-        buf = buf2;
-    }
-#endif
-
-    /* Request large pages for the buffer.  */
-    qemu_madvise(buf, size, QEMU_MADV_HUGEPAGE);
-
-    tcg_ctx->code_gen_buffer = buf;
-    return true;
-}
-
-#ifndef CONFIG_TCG_INTERPRETER
-#ifdef CONFIG_POSIX
-#include "qemu/memfd.h"
-
-static bool alloc_code_gen_buffer_splitwx_memfd(size_t size, Error **errp)
-{
-    void *buf_rw = NULL, *buf_rx = MAP_FAILED;
-    int fd = -1;
-
-#ifdef __mips__
-    /* Find space for the RX mapping, vs the 256MiB regions. */
-    if (!alloc_code_gen_buffer_anon(size, PROT_NONE,
-                                    MAP_PRIVATE | MAP_ANONYMOUS |
-                                    MAP_NORESERVE, errp)) {
-        return false;
-    }
-    /* The size of the mapping may have been adjusted. */
-    size = tcg_ctx->code_gen_buffer_size;
-    buf_rx = tcg_ctx->code_gen_buffer;
-#endif
-
-    buf_rw = qemu_memfd_alloc("tcg-jit", size, 0, &fd, errp);
-    if (buf_rw == NULL) {
-        goto fail;
-    }
-
-#ifdef __mips__
-    void *tmp = mmap(buf_rx, size, PROT_READ | PROT_EXEC,
-                     MAP_SHARED | MAP_FIXED, fd, 0);
-    if (tmp != buf_rx) {
-        goto fail_rx;
-    }
-#else
-    buf_rx = mmap(NULL, size, PROT_READ | PROT_EXEC, MAP_SHARED, fd, 0);
-    if (buf_rx == MAP_FAILED) {
-        goto fail_rx;
-    }
-#endif
-
-    close(fd);
-    tcg_ctx->code_gen_buffer = buf_rw;
-    tcg_ctx->code_gen_buffer_size = size;
-    tcg_splitwx_diff = buf_rx - buf_rw;
-
-    /* Request large pages for the buffer and the splitwx.  */
-    qemu_madvise(buf_rw, size, QEMU_MADV_HUGEPAGE);
-    qemu_madvise(buf_rx, size, QEMU_MADV_HUGEPAGE);
-    return true;
-
- fail_rx:
-    error_setg_errno(errp, errno, "failed to map shared memory for execute");
- fail:
-    if (buf_rx != MAP_FAILED) {
-        munmap(buf_rx, size);
-    }
-    if (buf_rw) {
-        munmap(buf_rw, size);
-    }
-    if (fd >= 0) {
-        close(fd);
-    }
-    return false;
-}
-#endif /* CONFIG_POSIX */
-
-#ifdef CONFIG_DARWIN
-#include <mach/mach.h>
-
-extern kern_return_t mach_vm_remap(vm_map_t target_task,
-                                   mach_vm_address_t *target_address,
-                                   mach_vm_size_t size,
-                                   mach_vm_offset_t mask,
-                                   int flags,
-                                   vm_map_t src_task,
-                                   mach_vm_address_t src_address,
-                                   boolean_t copy,
-                                   vm_prot_t *cur_protection,
-                                   vm_prot_t *max_protection,
-                                   vm_inherit_t inheritance);
-
-static bool alloc_code_gen_buffer_splitwx_vmremap(size_t size, Error **errp)
-{
-    kern_return_t ret;
-    mach_vm_address_t buf_rw, buf_rx;
-    vm_prot_t cur_prot, max_prot;
-
-    /* Map the read-write portion via normal anon memory. */
-    if (!alloc_code_gen_buffer_anon(size, PROT_READ | PROT_WRITE,
-                                    MAP_PRIVATE | MAP_ANONYMOUS, errp)) {
-        return false;
-    }
-
-    buf_rw = (mach_vm_address_t)tcg_ctx->code_gen_buffer;
-    buf_rx = 0;
-    ret = mach_vm_remap(mach_task_self(),
-                        &buf_rx,
-                        size,
-                        0,
-                        VM_FLAGS_ANYWHERE,
-                        mach_task_self(),
-                        buf_rw,
-                        false,
-                        &cur_prot,
-                        &max_prot,
-                        VM_INHERIT_NONE);
-    if (ret != KERN_SUCCESS) {
-        /* TODO: Convert "ret" to a human readable error message. */
-        error_setg(errp, "vm_remap for jit splitwx failed");
-        munmap((void *)buf_rw, size);
-        return false;
-    }
-
-    if (mprotect((void *)buf_rx, size, PROT_READ | PROT_EXEC) != 0) {
-        error_setg_errno(errp, errno, "mprotect for jit splitwx");
-        munmap((void *)buf_rx, size);
-        munmap((void *)buf_rw, size);
-        return false;
-    }
-
-    tcg_splitwx_diff = buf_rx - buf_rw;
-    return true;
-}
-#endif /* CONFIG_DARWIN */
-#endif /* CONFIG_TCG_INTERPRETER */
-
-static bool alloc_code_gen_buffer_splitwx(size_t size, Error **errp)
-{
-#ifndef CONFIG_TCG_INTERPRETER
-# ifdef CONFIG_DARWIN
-    return alloc_code_gen_buffer_splitwx_vmremap(size, errp);
-# endif
-# ifdef CONFIG_POSIX
-    return alloc_code_gen_buffer_splitwx_memfd(size, errp);
-# endif
-#endif
-    error_setg(errp, "jit split-wx not supported");
-    return false;
-}
-
-static bool alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
-{
-    ERRP_GUARD();
-    int prot, flags;
-
-    if (splitwx) {
-        if (alloc_code_gen_buffer_splitwx(size, errp)) {
-            return true;
-        }
-        /*
-         * If splitwx force-on (1), fail;
-         * if splitwx default-on (-1), fall through to splitwx off.
-         */
-        if (splitwx > 0) {
-            return false;
-        }
-        error_free_or_abort(errp);
-    }
-
-    prot = PROT_READ | PROT_WRITE | PROT_EXEC;
-    flags = MAP_PRIVATE | MAP_ANONYMOUS;
-#ifdef CONFIG_TCG_INTERPRETER
-    /* The tcg interpreter does not need execute permission. */
-    prot = PROT_READ | PROT_WRITE;
-#elif defined(CONFIG_DARWIN)
-    /* Applicable to both iOS and macOS (Apple Silicon). */
-    if (!splitwx) {
-        flags |= MAP_JIT;
-    }
-#endif
-
-    return alloc_code_gen_buffer_anon(size, prot, flags, errp);
-}
-#endif /* USE_STATIC_CODE_GEN_BUFFER, WIN32, POSIX */
-
 static bool tb_cmp(const void *ap, const void *bp)
 {
     const TranslationBlock *a = ap;
@@ -1323,19 +919,11 @@ static void tb_htable_init(void)
    size. */
 void tcg_exec_init(unsigned long tb_size, int splitwx)
 {
-    bool ok;
-
     tcg_allowed = true;
     tcg_context_init(&tcg_init_ctx);
     page_init();
     tb_htable_init();
-
-    ok = alloc_code_gen_buffer(size_code_gen_buffer(tb_size),
-                               splitwx, &error_fatal);
-    assert(ok);
-
-    /* TODO: allocating regions is hand-in-glove with code_gen_buffer. */
-    tcg_region_init();
+    tcg_region_init(tb_size, splitwx);
 
 #if defined(CONFIG_SOFTMMU)
     /* There's no guest base to take into account, so go ahead and
diff --git a/tcg/region.c b/tcg/region.c
index af45a0174e..8d88144a22 100644
--- a/tcg/region.c
+++ b/tcg/region.c
@@ -23,6 +23,8 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/units.h"
+#include "qapi/error.h"
 #include "exec/exec-all.h"
 #include "tcg/tcg.h"
 #if !defined(CONFIG_USER_ONLY)
@@ -406,6 +408,408 @@ static size_t tcg_n_regions(void)
 }
 #endif
 
+/* Minimum size of the code gen buffer.  This number is randomly chosen,
+   but not so small that we can't have a fair number of TB's live.  */
+#define MIN_CODE_GEN_BUFFER_SIZE     (1 * MiB)
+
+/* Maximum size of the code gen buffer we'd like to use.  Unless otherwise
+   indicated, this is constrained by the range of direct branches on the
+   host cpu, as used by the TCG implementation of goto_tb.  */
+#if defined(__x86_64__)
+# define MAX_CODE_GEN_BUFFER_SIZE  (2 * GiB)
+#elif defined(__sparc__)
+# define MAX_CODE_GEN_BUFFER_SIZE  (2 * GiB)
+#elif defined(__powerpc64__)
+# define MAX_CODE_GEN_BUFFER_SIZE  (2 * GiB)
+#elif defined(__powerpc__)
+# define MAX_CODE_GEN_BUFFER_SIZE  (32 * MiB)
+#elif defined(__aarch64__)
+# define MAX_CODE_GEN_BUFFER_SIZE  (2 * GiB)
+#elif defined(__s390x__)
+  /* We have a +- 4GB range on the branches; leave some slop.  */
+# define MAX_CODE_GEN_BUFFER_SIZE  (3 * GiB)
+#elif defined(__mips__)
+  /* We have a 256MB branch region, but leave room to make sure the
+     main executable is also within that region.  */
+# define MAX_CODE_GEN_BUFFER_SIZE  (128 * MiB)
+#else
+# define MAX_CODE_GEN_BUFFER_SIZE  ((size_t)-1)
+#endif
+
+#if TCG_TARGET_REG_BITS == 32
+#define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (32 * MiB)
+#ifdef CONFIG_USER_ONLY
+/*
+ * For user mode on smaller 32 bit systems we may run into trouble
+ * allocating big chunks of data in the right place. On these systems
+ * we utilise a static code generation buffer directly in the binary.
+ */
+#define USE_STATIC_CODE_GEN_BUFFER
+#endif
+#else /* TCG_TARGET_REG_BITS == 64 */
+#ifdef CONFIG_USER_ONLY
+/*
+ * As user-mode emulation typically means running multiple instances
+ * of the translator don't go too nuts with our default code gen
+ * buffer lest we make things too hard for the OS.
+ */
+#define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (128 * MiB)
+#else
+/*
+ * We expect most system emulation to run one or two guests per host.
+ * Users running large scale system emulation may want to tweak their
+ * runtime setup via the tb-size control on the command line.
+ */
+#define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (1 * GiB)
+#endif
+#endif
+
+#define DEFAULT_CODE_GEN_BUFFER_SIZE \
+  (DEFAULT_CODE_GEN_BUFFER_SIZE_1 < MAX_CODE_GEN_BUFFER_SIZE \
+   ? DEFAULT_CODE_GEN_BUFFER_SIZE_1 : MAX_CODE_GEN_BUFFER_SIZE)
+
+static size_t size_code_gen_buffer(size_t tb_size)
+{
+    /* Size the buffer.  */
+    if (tb_size == 0) {
+        size_t phys_mem = qemu_get_host_physmem();
+        if (phys_mem == 0) {
+            tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE;
+        } else {
+            tb_size = MIN(DEFAULT_CODE_GEN_BUFFER_SIZE, phys_mem / 8);
+        }
+    }
+    if (tb_size < MIN_CODE_GEN_BUFFER_SIZE) {
+        tb_size = MIN_CODE_GEN_BUFFER_SIZE;
+    }
+    if (tb_size > MAX_CODE_GEN_BUFFER_SIZE) {
+        tb_size = MAX_CODE_GEN_BUFFER_SIZE;
+    }
+    return tb_size;
+}
+
+#ifdef __mips__
+/* In order to use J and JAL within the code_gen_buffer, we require
+   that the buffer not cross a 256MB boundary.  */
+static inline bool cross_256mb(void *addr, size_t size)
+{
+    return ((uintptr_t)addr ^ ((uintptr_t)addr + size)) & ~0x0ffffffful;
+}
+
+/* We weren't able to allocate a buffer without crossing that boundary,
+   so make do with the larger portion of the buffer that doesn't cross.
+   Returns the new base of the buffer, and adjusts code_gen_buffer_size.  */
+static inline void *split_cross_256mb(void *buf1, size_t size1)
+{
+    void *buf2 = (void *)(((uintptr_t)buf1 + size1) & ~0x0ffffffful);
+    size_t size2 = buf1 + size1 - buf2;
+
+    size1 = buf2 - buf1;
+    if (size1 < size2) {
+        size1 = size2;
+        buf1 = buf2;
+    }
+
+    tcg_ctx->code_gen_buffer_size = size1;
+    return buf1;
+}
+#endif
+
+#ifdef USE_STATIC_CODE_GEN_BUFFER
+static uint8_t static_code_gen_buffer[DEFAULT_CODE_GEN_BUFFER_SIZE]
+    __attribute__((aligned(CODE_GEN_ALIGN)));
+
+static bool alloc_code_gen_buffer(size_t tb_size, int splitwx, Error **errp)
+{
+    void *buf, *end;
+    size_t size;
+
+    if (splitwx > 0) {
+        error_setg(errp, "jit split-wx not supported");
+        return false;
+    }
+
+    /* page-align the beginning and end of the buffer */
+    buf = static_code_gen_buffer;
+    end = static_code_gen_buffer + sizeof(static_code_gen_buffer);
+    buf = QEMU_ALIGN_PTR_UP(buf, qemu_real_host_page_size);
+    end = QEMU_ALIGN_PTR_DOWN(end, qemu_real_host_page_size);
+
+    size = end - buf;
+
+    /* Honor a command-line option limiting the size of the buffer.  */
+    if (size > tb_size) {
+        size = QEMU_ALIGN_DOWN(tb_size, qemu_real_host_page_size);
+    }
+    tcg_ctx->code_gen_buffer_size = size;
+
+#ifdef __mips__
+    if (cross_256mb(buf, size)) {
+        buf = split_cross_256mb(buf, size);
+        size = tcg_ctx->code_gen_buffer_size;
+    }
+#endif
+
+    if (qemu_mprotect_rwx(buf, size)) {
+        error_setg_errno(errp, errno, "mprotect of jit buffer");
+        return false;
+    }
+    qemu_madvise(buf, size, QEMU_MADV_HUGEPAGE);
+
+    tcg_ctx->code_gen_buffer = buf;
+    return true;
+}
+#elif defined(_WIN32)
+static bool alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
+{
+    void *buf;
+
+    if (splitwx > 0) {
+        error_setg(errp, "jit split-wx not supported");
+        return false;
+    }
+
+    buf = VirtualAlloc(NULL, size, MEM_RESERVE | MEM_COMMIT,
+                             PAGE_EXECUTE_READWRITE);
+    if (buf == NULL) {
+        error_setg_win32(errp, GetLastError(),
+                         "allocate %zu bytes for jit buffer", size);
+        return false;
+    }
+
+    tcg_ctx->code_gen_buffer = buf;
+    tcg_ctx->code_gen_buffer_size = size;
+    return true;
+}
+#else
+static bool alloc_code_gen_buffer_anon(size_t size, int prot,
+                                       int flags, Error **errp)
+{
+    void *buf;
+
+    buf = mmap(NULL, size, prot, flags, -1, 0);
+    if (buf == MAP_FAILED) {
+        error_setg_errno(errp, errno,
+                         "allocate %zu bytes for jit buffer", size);
+        return false;
+    }
+    tcg_ctx->code_gen_buffer_size = size;
+
+#ifdef __mips__
+    if (cross_256mb(buf, size)) {
+        /*
+         * Try again, with the original still mapped, to avoid re-acquiring
+         * the same 256mb crossing.
+         */
+        size_t size2;
+        void *buf2 = mmap(NULL, size, prot, flags, -1, 0);
+        switch ((int)(buf2 != MAP_FAILED)) {
+        case 1:
+            if (!cross_256mb(buf2, size)) {
+                /* Success!  Use the new buffer.  */
+                munmap(buf, size);
+                break;
+            }
+            /* Failure.  Work with what we had.  */
+            munmap(buf2, size);
+            /* fallthru */
+        default:
+            /* Split the original buffer.  Free the smaller half.  */
+            buf2 = split_cross_256mb(buf, size);
+            size2 = tcg_ctx->code_gen_buffer_size;
+            if (buf == buf2) {
+                munmap(buf + size2, size - size2);
+            } else {
+                munmap(buf, size - size2);
+            }
+            size = size2;
+            break;
+        }
+        buf = buf2;
+    }
+#endif
+
+    /* Request large pages for the buffer.  */
+    qemu_madvise(buf, size, QEMU_MADV_HUGEPAGE);
+
+    tcg_ctx->code_gen_buffer = buf;
+    return true;
+}
+
+#ifndef CONFIG_TCG_INTERPRETER
+#ifdef CONFIG_POSIX
+#include "qemu/memfd.h"
+
+static bool alloc_code_gen_buffer_splitwx_memfd(size_t size, Error **errp)
+{
+    void *buf_rw = NULL, *buf_rx = MAP_FAILED;
+    int fd = -1;
+
+#ifdef __mips__
+    /* Find space for the RX mapping, vs the 256MiB regions. */
+    if (!alloc_code_gen_buffer_anon(size, PROT_NONE,
+                                    MAP_PRIVATE | MAP_ANONYMOUS |
+                                    MAP_NORESERVE, errp)) {
+        return false;
+    }
+    /* The size of the mapping may have been adjusted. */
+    size = tcg_ctx->code_gen_buffer_size;
+    buf_rx = tcg_ctx->code_gen_buffer;
+#endif
+
+    buf_rw = qemu_memfd_alloc("tcg-jit", size, 0, &fd, errp);
+    if (buf_rw == NULL) {
+        goto fail;
+    }
+
+#ifdef __mips__
+    void *tmp = mmap(buf_rx, size, PROT_READ | PROT_EXEC,
+                     MAP_SHARED | MAP_FIXED, fd, 0);
+    if (tmp != buf_rx) {
+        goto fail_rx;
+    }
+#else
+    buf_rx = mmap(NULL, size, PROT_READ | PROT_EXEC, MAP_SHARED, fd, 0);
+    if (buf_rx == MAP_FAILED) {
+        goto fail_rx;
+    }
+#endif
+
+    close(fd);
+    tcg_ctx->code_gen_buffer = buf_rw;
+    tcg_ctx->code_gen_buffer_size = size;
+    tcg_splitwx_diff = buf_rx - buf_rw;
+
+    /* Request large pages for the buffer and the splitwx.  */
+    qemu_madvise(buf_rw, size, QEMU_MADV_HUGEPAGE);
+    qemu_madvise(buf_rx, size, QEMU_MADV_HUGEPAGE);
+    return true;
+
+ fail_rx:
+    error_setg_errno(errp, errno, "failed to map shared memory for execute");
+ fail:
+    if (buf_rx != MAP_FAILED) {
+        munmap(buf_rx, size);
+    }
+    if (buf_rw) {
+        munmap(buf_rw, size);
+    }
+    if (fd >= 0) {
+        close(fd);
+    }
+    return false;
+}
+#endif /* CONFIG_POSIX */
+
+#ifdef CONFIG_DARWIN
+#include <mach/mach.h>
+
+extern kern_return_t mach_vm_remap(vm_map_t target_task,
+                                   mach_vm_address_t *target_address,
+                                   mach_vm_size_t size,
+                                   mach_vm_offset_t mask,
+                                   int flags,
+                                   vm_map_t src_task,
+                                   mach_vm_address_t src_address,
+                                   boolean_t copy,
+                                   vm_prot_t *cur_protection,
+                                   vm_prot_t *max_protection,
+                                   vm_inherit_t inheritance);
+
+static bool alloc_code_gen_buffer_splitwx_vmremap(size_t size, Error **errp)
+{
+    kern_return_t ret;
+    mach_vm_address_t buf_rw, buf_rx;
+    vm_prot_t cur_prot, max_prot;
+
+    /* Map the read-write portion via normal anon memory. */
+    if (!alloc_code_gen_buffer_anon(size, PROT_READ | PROT_WRITE,
+                                    MAP_PRIVATE | MAP_ANONYMOUS, errp)) {
+        return false;
+    }
+
+    buf_rw = (mach_vm_address_t)tcg_ctx->code_gen_buffer;
+    buf_rx = 0;
+    ret = mach_vm_remap(mach_task_self(),
+                        &buf_rx,
+                        size,
+                        0,
+                        VM_FLAGS_ANYWHERE,
+                        mach_task_self(),
+                        buf_rw,
+                        false,
+                        &cur_prot,
+                        &max_prot,
+                        VM_INHERIT_NONE);
+    if (ret != KERN_SUCCESS) {
+        /* TODO: Convert "ret" to a human readable error message. */
+        error_setg(errp, "vm_remap for jit splitwx failed");
+        munmap((void *)buf_rw, size);
+        return false;
+    }
+
+    if (mprotect((void *)buf_rx, size, PROT_READ | PROT_EXEC) != 0) {
+        error_setg_errno(errp, errno, "mprotect for jit splitwx");
+        munmap((void *)buf_rx, size);
+        munmap((void *)buf_rw, size);
+        return false;
+    }
+
+    tcg_splitwx_diff = buf_rx - buf_rw;
+    return true;
+}
+#endif /* CONFIG_DARWIN */
+#endif /* CONFIG_TCG_INTERPRETER */
+
+static bool alloc_code_gen_buffer_splitwx(size_t size, Error **errp)
+{
+#ifndef CONFIG_TCG_INTERPRETER
+# ifdef CONFIG_DARWIN
+    return alloc_code_gen_buffer_splitwx_vmremap(size, errp);
+# endif
+# ifdef CONFIG_POSIX
+    return alloc_code_gen_buffer_splitwx_memfd(size, errp);
+# endif
+#endif
+    error_setg(errp, "jit split-wx not supported");
+    return false;
+}
+
+static bool alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
+{
+    ERRP_GUARD();
+    int prot, flags;
+
+    if (splitwx) {
+        if (alloc_code_gen_buffer_splitwx(size, errp)) {
+            return true;
+        }
+        /*
+         * If splitwx force-on (1), fail;
+         * if splitwx default-on (-1), fall through to splitwx off.
+         */
+        if (splitwx > 0) {
+            return false;
+        }
+        error_free_or_abort(errp);
+    }
+
+    prot = PROT_READ | PROT_WRITE | PROT_EXEC;
+    flags = MAP_PRIVATE | MAP_ANONYMOUS;
+#ifdef CONFIG_TCG_INTERPRETER
+    /* The tcg interpreter does not need execute permission. */
+    prot = PROT_READ | PROT_WRITE;
+#elif defined(CONFIG_DARWIN)
+    /* Applicable to both iOS and macOS (Apple Silicon). */
+    if (!splitwx) {
+        flags |= MAP_JIT;
+    }
+#endif
+
+    return alloc_code_gen_buffer_anon(size, prot, flags, errp);
+}
+#endif /* USE_STATIC_CODE_GEN_BUFFER, WIN32, POSIX */
+
 /*
  * Initializes region partitioning.
  *
@@ -434,17 +838,24 @@ static size_t tcg_n_regions(void)
  * in practice. Multi-threaded guests share most if not all of their translated
  * code, which makes parallel code generation less appealing than in softmmu.
  */
-void tcg_region_init(void)
+void tcg_region_init(size_t tb_size, int splitwx)
 {
-    void *buf = tcg_init_ctx.code_gen_buffer;
-    void *aligned;
-    size_t size = tcg_init_ctx.code_gen_buffer_size;
-    size_t page_size = qemu_real_host_page_size;
+    void *buf, *aligned;
+    size_t size;
+    size_t page_size;
     size_t region_size;
     size_t n_regions;
     size_t i;
     uintptr_t splitwx_diff;
+    bool ok;
 
+    ok = alloc_code_gen_buffer(size_code_gen_buffer(tb_size),
+                               splitwx, &error_fatal);
+    assert(ok);
+
+    buf = tcg_init_ctx.code_gen_buffer;
+    size = tcg_init_ctx.code_gen_buffer_size;
+    page_size = qemu_real_host_page_size;
     n_regions = tcg_n_regions();
 
     /* The first region will be 'aligned - buf' bytes larger than the others */
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 10/29] accel/tcg: Rename tcg_init to tcg_init_machine
  2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
                   ` (8 preceding siblings ...)
  2021-03-14 21:27 ` [PATCH v2 09/29] accel/tcg: Move alloc_code_gen_buffer to tcg/region.c Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
  2021-03-14 21:27 ` [PATCH v2 11/29] tcg: Create tcg_init Richard Henderson
                   ` (20 subsequent siblings)
  30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: r.bolshakov, j, Philippe Mathieu-Daudé

We shortly want to use tcg_init for something else.
Since the hook is called init_machine, match that.

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/tcg-all.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/accel/tcg/tcg-all.c b/accel/tcg/tcg-all.c
index f132033999..30d81ff7f5 100644
--- a/accel/tcg/tcg-all.c
+++ b/accel/tcg/tcg-all.c
@@ -105,7 +105,7 @@ static void tcg_accel_instance_init(Object *obj)
 
 bool mttcg_enabled;
 
-static int tcg_init(MachineState *ms)
+static int tcg_init_machine(MachineState *ms)
 {
     TCGState *s = TCG_STATE(current_accel());
 
@@ -189,7 +189,7 @@ static void tcg_accel_class_init(ObjectClass *oc, void *data)
 {
     AccelClass *ac = ACCEL_CLASS(oc);
     ac->name = "tcg";
-    ac->init_machine = tcg_init;
+    ac->init_machine = tcg_init_machine;
     ac->allowed = &tcg_allowed;
 
     object_class_property_add_str(oc, "thread",
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 11/29] tcg: Create tcg_init
  2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
                   ` (9 preceding siblings ...)
  2021-03-14 21:27 ` [PATCH v2 10/29] accel/tcg: Rename tcg_init to tcg_init_machine Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
  2021-03-14 21:27 ` [PATCH v2 12/29] accel/tcg: Merge tcg_exec_init into tcg_init_machine Richard Henderson
                   ` (19 subsequent siblings)
  30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: r.bolshakov, j

Perform both tcg_context_init and tcg_region_init.
Do not leave this split to the caller.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/tcg/tcg.h         | 3 +--
 tcg/internal.h            | 1 +
 accel/tcg/translate-all.c | 3 +--
 tcg/tcg.c                 | 9 ++++++++-
 4 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index 7a435bf807..3ad77ec34d 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -874,7 +874,6 @@ void *tcg_malloc_internal(TCGContext *s, int size);
 void tcg_pool_reset(TCGContext *s);
 TranslationBlock *tcg_tb_alloc(TCGContext *s);
 
-void tcg_region_init(size_t tb_size, int splitwx);
 void tb_destroy(TranslationBlock *tb);
 void tcg_region_reset_all(void);
 
@@ -907,7 +906,7 @@ static inline void *tcg_malloc(int size)
     }
 }
 
-void tcg_context_init(TCGContext *s);
+void tcg_init(size_t tb_size, int splitwx);
 void tcg_register_thread(void);
 void tcg_prologue_init(TCGContext *s);
 void tcg_func_start(TCGContext *s);
diff --git a/tcg/internal.h b/tcg/internal.h
index b1dda343c2..f13c564d9b 100644
--- a/tcg/internal.h
+++ b/tcg/internal.h
@@ -30,6 +30,7 @@
 extern TCGContext **tcg_ctxs;
 extern unsigned int n_tcg_ctxs;
 
+void tcg_region_init(size_t tb_size, int splitwx);
 bool tcg_region_alloc(TCGContext *s);
 void tcg_region_initial_alloc(TCGContext *s);
 void tcg_region_prologue_set(TCGContext *s);
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 4071edda16..050b4bff46 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -920,10 +920,9 @@ static void tb_htable_init(void)
 void tcg_exec_init(unsigned long tb_size, int splitwx)
 {
     tcg_allowed = true;
-    tcg_context_init(&tcg_init_ctx);
     page_init();
     tb_htable_init();
-    tcg_region_init(tb_size, splitwx);
+    tcg_init(tb_size, splitwx);
 
 #if defined(CONFIG_SOFTMMU)
     /* There's no guest base to take into account, so go ahead and
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 10a571d41c..65a63bda8a 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -576,8 +576,9 @@ static void process_op_defs(TCGContext *s);
 static TCGTemp *tcg_global_reg_new_internal(TCGContext *s, TCGType type,
                                             TCGReg reg, const char *name);
 
-void tcg_context_init(TCGContext *s)
+static void tcg_context_init(void)
 {
+    TCGContext *s = &tcg_init_ctx;
     int op, total_args, n, i;
     TCGOpDef *def;
     TCGArgConstraint *args_ct;
@@ -654,6 +655,12 @@ void tcg_context_init(TCGContext *s)
     cpu_env = temp_tcgv_ptr(ts);
 }
 
+void tcg_init(size_t tb_size, int splitwx)
+{
+    tcg_context_init();
+    tcg_region_init(tb_size, splitwx);
+}
+
 /*
  * Allocate TBs right before their corresponding translated code, making
  * sure that TBs and code are on different cache lines.
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 12/29] accel/tcg: Merge tcg_exec_init into tcg_init_machine
  2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
                   ` (10 preceding siblings ...)
  2021-03-14 21:27 ` [PATCH v2 11/29] tcg: Create tcg_init Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
  2021-03-14 21:27 ` [PATCH v2 13/29] accel/tcg: Pass down max_cpus to tcg_init Richard Henderson
                   ` (18 subsequent siblings)
  30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: r.bolshakov, j

There is only one caller, and shortly we will need access
to the MachineState, which tcg_init_machine already has.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/internal.h      |  2 ++
 include/sysemu/tcg.h      |  2 --
 accel/tcg/tcg-all.c       | 14 +++++++++++++-
 accel/tcg/translate-all.c | 21 ++-------------------
 4 files changed, 17 insertions(+), 22 deletions(-)

diff --git a/accel/tcg/internal.h b/accel/tcg/internal.h
index e9c145e0fb..881bc1ede0 100644
--- a/accel/tcg/internal.h
+++ b/accel/tcg/internal.h
@@ -16,5 +16,7 @@ TranslationBlock *tb_gen_code(CPUState *cpu, target_ulong pc,
                               int cflags);
 
 void QEMU_NORETURN cpu_io_recompile(CPUState *cpu, uintptr_t retaddr);
+void page_init(void);
+void tb_htable_init(void);
 
 #endif /* ACCEL_TCG_INTERNAL_H */
diff --git a/include/sysemu/tcg.h b/include/sysemu/tcg.h
index 00349fb18a..53352450ff 100644
--- a/include/sysemu/tcg.h
+++ b/include/sysemu/tcg.h
@@ -8,8 +8,6 @@
 #ifndef SYSEMU_TCG_H
 #define SYSEMU_TCG_H
 
-void tcg_exec_init(unsigned long tb_size, int splitwx);
-
 #ifdef CONFIG_TCG
 extern bool tcg_allowed;
 #define tcg_enabled() (tcg_allowed)
diff --git a/accel/tcg/tcg-all.c b/accel/tcg/tcg-all.c
index 30d81ff7f5..0e83acbfe5 100644
--- a/accel/tcg/tcg-all.c
+++ b/accel/tcg/tcg-all.c
@@ -32,6 +32,7 @@
 #include "qemu/error-report.h"
 #include "qemu/accel.h"
 #include "qapi/qapi-builtin-visit.h"
+#include "internal.h"
 
 struct TCGState {
     AccelState parent_obj;
@@ -109,8 +110,19 @@ static int tcg_init_machine(MachineState *ms)
 {
     TCGState *s = TCG_STATE(current_accel());
 
-    tcg_exec_init(s->tb_size * 1024 * 1024, s->splitwx_enabled);
+    tcg_allowed = true;
     mttcg_enabled = s->mttcg_enabled;
+
+    page_init();
+    tb_htable_init();
+    tcg_init(s->tb_size * 1024 * 1024, s->splitwx_enabled);
+
+#if defined(CONFIG_SOFTMMU)
+    /* There's no guest base to take into account, so go ahead and
+       initialize the prologue now.  */
+    tcg_prologue_init(tcg_ctx);
+#endif
+
     return 0;
 }
 
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 050b4bff46..40aeecf611 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -408,7 +408,7 @@ bool cpu_restore_state(CPUState *cpu, uintptr_t host_pc, bool will_exit)
     return false;
 }
 
-static void page_init(void)
+void page_init(void)
 {
     page_size_init();
     page_table_config_init();
@@ -907,30 +907,13 @@ static bool tb_cmp(const void *ap, const void *bp)
         a->page_addr[1] == b->page_addr[1];
 }
 
-static void tb_htable_init(void)
+void tb_htable_init(void)
 {
     unsigned int mode = QHT_MODE_AUTO_RESIZE;
 
     qht_init(&tb_ctx.htable, tb_cmp, CODE_GEN_HTABLE_SIZE, mode);
 }
 
-/* Must be called before using the QEMU cpus. 'tb_size' is the size
-   (in bytes) allocated to the translation buffer. Zero means default
-   size. */
-void tcg_exec_init(unsigned long tb_size, int splitwx)
-{
-    tcg_allowed = true;
-    page_init();
-    tb_htable_init();
-    tcg_init(tb_size, splitwx);
-
-#if defined(CONFIG_SOFTMMU)
-    /* There's no guest base to take into account, so go ahead and
-       initialize the prologue now.  */
-    tcg_prologue_init(tcg_ctx);
-#endif
-}
-
 /* call with @p->lock held */
 static inline void invalidate_page_bitmap(PageDesc *p)
 {
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 13/29] accel/tcg: Pass down max_cpus to tcg_init
  2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
                   ` (11 preceding siblings ...)
  2021-03-14 21:27 ` [PATCH v2 12/29] accel/tcg: Merge tcg_exec_init into tcg_init_machine Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
  2021-03-14 21:27 ` [PATCH v2 14/29] tcg: Introduce tcg_max_ctxs Richard Henderson
                   ` (17 subsequent siblings)
  30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: r.bolshakov, j, Philippe Mathieu-Daudé

Start removing the include of hw/boards.h from tcg/.
Pass down the max_cpus value from tcg_init_machine,
where we have the MachineState already.

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/tcg/tcg.h   |  2 +-
 tcg/internal.h      |  2 +-
 accel/tcg/tcg-all.c | 10 +++++++++-
 tcg/region.c        | 32 +++++++++++---------------------
 tcg/tcg.c           | 10 ++++------
 5 files changed, 26 insertions(+), 30 deletions(-)

diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index 3ad77ec34d..a0122c0dd3 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -906,7 +906,7 @@ static inline void *tcg_malloc(int size)
     }
 }
 
-void tcg_init(size_t tb_size, int splitwx);
+void tcg_init(size_t tb_size, int splitwx, unsigned max_cpus);
 void tcg_register_thread(void);
 void tcg_prologue_init(TCGContext *s);
 void tcg_func_start(TCGContext *s);
diff --git a/tcg/internal.h b/tcg/internal.h
index f13c564d9b..fcfeca232f 100644
--- a/tcg/internal.h
+++ b/tcg/internal.h
@@ -30,7 +30,7 @@
 extern TCGContext **tcg_ctxs;
 extern unsigned int n_tcg_ctxs;
 
-void tcg_region_init(size_t tb_size, int splitwx);
+void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus);
 bool tcg_region_alloc(TCGContext *s);
 void tcg_region_initial_alloc(TCGContext *s);
 void tcg_region_prologue_set(TCGContext *s);
diff --git a/accel/tcg/tcg-all.c b/accel/tcg/tcg-all.c
index 0e83acbfe5..d2f2ddb844 100644
--- a/accel/tcg/tcg-all.c
+++ b/accel/tcg/tcg-all.c
@@ -32,6 +32,9 @@
 #include "qemu/error-report.h"
 #include "qemu/accel.h"
 #include "qapi/qapi-builtin-visit.h"
+#if !defined(CONFIG_USER_ONLY)
+#include "hw/boards.h"
+#endif
 #include "internal.h"
 
 struct TCGState {
@@ -109,13 +112,18 @@ bool mttcg_enabled;
 static int tcg_init_machine(MachineState *ms)
 {
     TCGState *s = TCG_STATE(current_accel());
+#ifdef CONFIG_USER_ONLY
+    unsigned max_cpus = 1;
+#else
+    unsigned max_cpus = ms->smp.max_cpus;
+#endif
 
     tcg_allowed = true;
     mttcg_enabled = s->mttcg_enabled;
 
     page_init();
     tb_htable_init();
-    tcg_init(s->tb_size * 1024 * 1024, s->splitwx_enabled);
+    tcg_init(s->tb_size * 1024 * 1024, s->splitwx_enabled, max_cpus);
 
 #if defined(CONFIG_SOFTMMU)
     /* There's no guest base to take into account, so go ahead and
diff --git a/tcg/region.c b/tcg/region.c
index 8d88144a22..04b699da63 100644
--- a/tcg/region.c
+++ b/tcg/region.c
@@ -27,9 +27,6 @@
 #include "qapi/error.h"
 #include "exec/exec-all.h"
 #include "tcg/tcg.h"
-#if !defined(CONFIG_USER_ONLY)
-#include "hw/boards.h"
-#endif
 #include "internal.h"
 
 
@@ -366,27 +363,20 @@ void tcg_region_reset_all(void)
     tcg_region_tree_reset_all();
 }
 
+static size_t tcg_n_regions(unsigned max_cpus)
+{
 #ifdef CONFIG_USER_ONLY
-static size_t tcg_n_regions(void)
-{
     return 1;
-}
 #else
-/*
- * It is likely that some vCPUs will translate more code than others, so we
- * first try to set more regions than max_cpus, with those regions being of
- * reasonable size. If that's not possible we make do by evenly dividing
- * the code_gen_buffer among the vCPUs.
- */
-static size_t tcg_n_regions(void)
-{
+    /*
+     * It is likely that some vCPUs will translate more code than others,
+     * so we first try to set more regions than max_cpus, with those regions
+     * being of reasonable size. If that's not possible we make do by evenly
+     * dividing the code_gen_buffer among the vCPUs.
+     */
     size_t i;
 
     /* Use a single region if all we have is one vCPU thread */
-#if !defined(CONFIG_USER_ONLY)
-    MachineState *ms = MACHINE(qdev_get_machine());
-    unsigned int max_cpus = ms->smp.max_cpus;
-#endif
     if (max_cpus == 1 || !qemu_tcg_mttcg_enabled()) {
         return 1;
     }
@@ -405,8 +395,8 @@ static size_t tcg_n_regions(void)
     }
     /* If we can't, then just allocate one region per vCPU thread */
     return max_cpus;
-}
 #endif
+}
 
 /* Minimum size of the code gen buffer.  This number is randomly chosen,
    but not so small that we can't have a fair number of TB's live.  */
@@ -838,7 +828,7 @@ static bool alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
  * in practice. Multi-threaded guests share most if not all of their translated
  * code, which makes parallel code generation less appealing than in softmmu.
  */
-void tcg_region_init(size_t tb_size, int splitwx)
+void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
 {
     void *buf, *aligned;
     size_t size;
@@ -856,7 +846,7 @@ void tcg_region_init(size_t tb_size, int splitwx)
     buf = tcg_init_ctx.code_gen_buffer;
     size = tcg_init_ctx.code_gen_buffer_size;
     page_size = qemu_real_host_page_size;
-    n_regions = tcg_n_regions();
+    n_regions = tcg_n_regions(max_cpus);
 
     /* The first region will be 'aligned - buf' bytes larger than the others */
     aligned = QEMU_ALIGN_PTR_UP(buf, page_size);
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 65a63bda8a..a89d8f6b81 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -576,7 +576,7 @@ static void process_op_defs(TCGContext *s);
 static TCGTemp *tcg_global_reg_new_internal(TCGContext *s, TCGType type,
                                             TCGReg reg, const char *name);
 
-static void tcg_context_init(void)
+static void tcg_context_init(unsigned max_cpus)
 {
     TCGContext *s = &tcg_init_ctx;
     int op, total_args, n, i;
@@ -645,8 +645,6 @@ static void tcg_context_init(void)
     tcg_ctxs = &tcg_ctx;
     n_tcg_ctxs = 1;
 #else
-    MachineState *ms = MACHINE(qdev_get_machine());
-    unsigned int max_cpus = ms->smp.max_cpus;
     tcg_ctxs = g_new(TCGContext *, max_cpus);
 #endif
 
@@ -655,10 +653,10 @@ static void tcg_context_init(void)
     cpu_env = temp_tcgv_ptr(ts);
 }
 
-void tcg_init(size_t tb_size, int splitwx)
+void tcg_init(size_t tb_size, int splitwx, unsigned max_cpus)
 {
-    tcg_context_init();
-    tcg_region_init(tb_size, splitwx);
+    tcg_context_init(max_cpus);
+    tcg_region_init(tb_size, splitwx, max_cpus);
 }
 
 /*
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 14/29] tcg: Introduce tcg_max_ctxs
  2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
                   ` (12 preceding siblings ...)
  2021-03-14 21:27 ` [PATCH v2 13/29] accel/tcg: Pass down max_cpus to tcg_init Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
  2021-03-14 21:27 ` [PATCH v2 15/29] tcg: Move MAX_CODE_GEN_BUFFER_SIZE to tcg-target.h Richard Henderson
                   ` (16 subsequent siblings)
  30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: r.bolshakov, j, Philippe Mathieu-Daudé

Finish the divorce of tcg/ from hw/, and do not take
the max cpu value from MachineState; just remember what
we were passed in tcg_init.

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/internal.h |  3 ++-
 tcg/region.c   |  6 +++---
 tcg/tcg.c      | 23 ++++++++++-------------
 3 files changed, 15 insertions(+), 17 deletions(-)

diff --git a/tcg/internal.h b/tcg/internal.h
index fcfeca232f..f9906523da 100644
--- a/tcg/internal.h
+++ b/tcg/internal.h
@@ -28,7 +28,8 @@
 #define TCG_HIGHWATER 1024
 
 extern TCGContext **tcg_ctxs;
-extern unsigned int n_tcg_ctxs;
+extern unsigned int tcg_cur_ctxs;
+extern unsigned int tcg_max_ctxs;
 
 void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus);
 bool tcg_region_alloc(TCGContext *s);
diff --git a/tcg/region.c b/tcg/region.c
index 04b699da63..e3fbf6a7e7 100644
--- a/tcg/region.c
+++ b/tcg/region.c
@@ -347,7 +347,7 @@ void tcg_region_initial_alloc(TCGContext *s)
 /* Call from a safe-work context */
 void tcg_region_reset_all(void)
 {
-    unsigned int n_ctxs = qatomic_read(&n_tcg_ctxs);
+    unsigned int n_ctxs = qatomic_read(&tcg_cur_ctxs);
     unsigned int i;
 
     qemu_mutex_lock(&region.lock);
@@ -922,7 +922,7 @@ void tcg_region_prologue_set(TCGContext *s)
  */
 size_t tcg_code_size(void)
 {
-    unsigned int n_ctxs = qatomic_read(&n_tcg_ctxs);
+    unsigned int n_ctxs = qatomic_read(&tcg_cur_ctxs);
     unsigned int i;
     size_t total;
 
@@ -958,7 +958,7 @@ size_t tcg_code_capacity(void)
 
 size_t tcg_tb_phys_invalidate_count(void)
 {
-    unsigned int n_ctxs = qatomic_read(&n_tcg_ctxs);
+    unsigned int n_ctxs = qatomic_read(&tcg_cur_ctxs);
     unsigned int i;
     size_t total = 0;
 
diff --git a/tcg/tcg.c b/tcg/tcg.c
index a89d8f6b81..a82d3a0861 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -44,11 +44,6 @@
 #include "cpu.h"
 
 #include "exec/exec-all.h"
-
-#if !defined(CONFIG_USER_ONLY)
-#include "hw/boards.h"
-#endif
-
 #include "tcg/tcg-op.h"
 
 #if UINTPTR_MAX == UINT32_MAX
@@ -155,7 +150,8 @@ static int tcg_out_ldst_finalize(TCGContext *s);
 #endif
 
 TCGContext **tcg_ctxs;
-unsigned int n_tcg_ctxs;
+unsigned int tcg_cur_ctxs;
+unsigned int tcg_max_ctxs;
 TCGv_env cpu_env = 0;
 const void *tcg_code_gen_epilogue;
 uintptr_t tcg_splitwx_diff;
@@ -475,7 +471,6 @@ void tcg_register_thread(void)
 #else
 void tcg_register_thread(void)
 {
-    MachineState *ms = MACHINE(qdev_get_machine());
     TCGContext *s = g_malloc(sizeof(*s));
     unsigned int i, n;
 
@@ -491,8 +486,8 @@ void tcg_register_thread(void)
     }
 
     /* Claim an entry in tcg_ctxs */
-    n = qatomic_fetch_inc(&n_tcg_ctxs);
-    g_assert(n < ms->smp.max_cpus);
+    n = qatomic_fetch_inc(&tcg_cur_ctxs);
+    g_assert(n < tcg_max_ctxs);
     qatomic_set(&tcg_ctxs[n], s);
 
     if (n > 0) {
@@ -643,9 +638,11 @@ static void tcg_context_init(unsigned max_cpus)
      */
 #ifdef CONFIG_USER_ONLY
     tcg_ctxs = &tcg_ctx;
-    n_tcg_ctxs = 1;
+    tcg_cur_ctxs = 1;
+    tcg_max_ctxs = 1;
 #else
-    tcg_ctxs = g_new(TCGContext *, max_cpus);
+    tcg_max_ctxs = max_cpus;
+    tcg_ctxs = g_new0(TCGContext *, max_cpus);
 #endif
 
     tcg_debug_assert(!tcg_regset_test_reg(s->reserved_regs, TCG_AREG0));
@@ -3937,7 +3934,7 @@ static void tcg_reg_alloc_call(TCGContext *s, TCGOp *op)
 static inline
 void tcg_profile_snapshot(TCGProfile *prof, bool counters, bool table)
 {
-    unsigned int n_ctxs = qatomic_read(&n_tcg_ctxs);
+    unsigned int n_ctxs = qatomic_read(&tcg_cur_ctxs);
     unsigned int i;
 
     for (i = 0; i < n_ctxs; i++) {
@@ -4000,7 +3997,7 @@ void tcg_dump_op_count(void)
 
 int64_t tcg_cpu_exec_time(void)
 {
-    unsigned int n_ctxs = qatomic_read(&n_tcg_ctxs);
+    unsigned int n_ctxs = qatomic_read(&tcg_cur_ctxs);
     unsigned int i;
     int64_t ret = 0;
 
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 15/29] tcg: Move MAX_CODE_GEN_BUFFER_SIZE to tcg-target.h
  2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
                   ` (13 preceding siblings ...)
  2021-03-14 21:27 ` [PATCH v2 14/29] tcg: Introduce tcg_max_ctxs Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
  2021-03-14 21:27 ` [PATCH v2 16/29] tcg: Replace region.end with region.total_size Richard Henderson
                   ` (15 subsequent siblings)
  30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: r.bolshakov, j

Remove the ifdef ladder and move each define into the
appropriate header file.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
v2: Retain comment about M_C_G_B_S constraint (balaton)
---
 tcg/aarch64/tcg-target.h |  1 +
 tcg/arm/tcg-target.h     |  1 +
 tcg/i386/tcg-target.h    |  2 ++
 tcg/mips/tcg-target.h    |  6 ++++++
 tcg/ppc/tcg-target.h     |  2 ++
 tcg/riscv/tcg-target.h   |  1 +
 tcg/s390/tcg-target.h    |  3 +++
 tcg/sparc/tcg-target.h   |  1 +
 tcg/tci/tcg-target.h     |  1 +
 tcg/region.c             | 35 +++++++++--------------------------
 10 files changed, 27 insertions(+), 26 deletions(-)

diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h
index 5ec30dba25..ef55f7c185 100644
--- a/tcg/aarch64/tcg-target.h
+++ b/tcg/aarch64/tcg-target.h
@@ -15,6 +15,7 @@
 
 #define TCG_TARGET_INSN_UNIT_SIZE  4
 #define TCG_TARGET_TLB_DISPLACEMENT_BITS 24
+#define MAX_CODE_GEN_BUFFER_SIZE  (2 * GiB)
 #undef TCG_TARGET_STACK_GROWSUP
 
 typedef enum {
diff --git a/tcg/arm/tcg-target.h b/tcg/arm/tcg-target.h
index 8d1fee6327..b9a85d0f83 100644
--- a/tcg/arm/tcg-target.h
+++ b/tcg/arm/tcg-target.h
@@ -60,6 +60,7 @@ extern int arm_arch;
 #undef TCG_TARGET_STACK_GROWSUP
 #define TCG_TARGET_INSN_UNIT_SIZE 4
 #define TCG_TARGET_TLB_DISPLACEMENT_BITS 16
+#define MAX_CODE_GEN_BUFFER_SIZE  UINT32_MAX
 
 typedef enum {
     TCG_REG_R0 = 0,
diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
index b693d3692d..ac10066c3e 100644
--- a/tcg/i386/tcg-target.h
+++ b/tcg/i386/tcg-target.h
@@ -31,9 +31,11 @@
 #ifdef __x86_64__
 # define TCG_TARGET_REG_BITS  64
 # define TCG_TARGET_NB_REGS   32
+# define MAX_CODE_GEN_BUFFER_SIZE  (2 * GiB)
 #else
 # define TCG_TARGET_REG_BITS  32
 # define TCG_TARGET_NB_REGS   24
+# define MAX_CODE_GEN_BUFFER_SIZE  UINT32_MAX
 #endif
 
 typedef enum {
diff --git a/tcg/mips/tcg-target.h b/tcg/mips/tcg-target.h
index c2c32fb38f..e81e824cab 100644
--- a/tcg/mips/tcg-target.h
+++ b/tcg/mips/tcg-target.h
@@ -39,6 +39,12 @@
 #define TCG_TARGET_TLB_DISPLACEMENT_BITS 16
 #define TCG_TARGET_NB_REGS 32
 
+/*
+ * We have a 256MB branch region, but leave room to make sure the
+ * main executable is also within that region.
+ */
+#define MAX_CODE_GEN_BUFFER_SIZE  (128 * MiB)
+
 typedef enum {
     TCG_REG_ZERO = 0,
     TCG_REG_AT,
diff --git a/tcg/ppc/tcg-target.h b/tcg/ppc/tcg-target.h
index d1339afc66..c13ed5640a 100644
--- a/tcg/ppc/tcg-target.h
+++ b/tcg/ppc/tcg-target.h
@@ -27,8 +27,10 @@
 
 #ifdef _ARCH_PPC64
 # define TCG_TARGET_REG_BITS  64
+# define MAX_CODE_GEN_BUFFER_SIZE  (2 * GiB)
 #else
 # define TCG_TARGET_REG_BITS  32
+# define MAX_CODE_GEN_BUFFER_SIZE  (32 * MiB)
 #endif
 
 #define TCG_TARGET_NB_REGS 64
diff --git a/tcg/riscv/tcg-target.h b/tcg/riscv/tcg-target.h
index 727c8df418..87ea94666b 100644
--- a/tcg/riscv/tcg-target.h
+++ b/tcg/riscv/tcg-target.h
@@ -34,6 +34,7 @@
 #define TCG_TARGET_INSN_UNIT_SIZE 4
 #define TCG_TARGET_TLB_DISPLACEMENT_BITS 20
 #define TCG_TARGET_NB_REGS 32
+#define MAX_CODE_GEN_BUFFER_SIZE  ((size_t)-1)
 
 typedef enum {
     TCG_REG_ZERO,
diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
index 641464eea4..b04b72b7eb 100644
--- a/tcg/s390/tcg-target.h
+++ b/tcg/s390/tcg-target.h
@@ -28,6 +28,9 @@
 #define TCG_TARGET_INSN_UNIT_SIZE 2
 #define TCG_TARGET_TLB_DISPLACEMENT_BITS 19
 
+/* We have a +- 4GB range on the branches; leave some slop.  */
+#define MAX_CODE_GEN_BUFFER_SIZE  (3 * GiB)
+
 typedef enum TCGReg {
     TCG_REG_R0 = 0,
     TCG_REG_R1,
diff --git a/tcg/sparc/tcg-target.h b/tcg/sparc/tcg-target.h
index f66f5d07dc..86bb9a2d39 100644
--- a/tcg/sparc/tcg-target.h
+++ b/tcg/sparc/tcg-target.h
@@ -30,6 +30,7 @@
 #define TCG_TARGET_INSN_UNIT_SIZE 4
 #define TCG_TARGET_TLB_DISPLACEMENT_BITS 32
 #define TCG_TARGET_NB_REGS 32
+#define MAX_CODE_GEN_BUFFER_SIZE  (2 * GiB)
 
 typedef enum {
     TCG_REG_G0 = 0,
diff --git a/tcg/tci/tcg-target.h b/tcg/tci/tcg-target.h
index 9c0021a26f..03cf527cb4 100644
--- a/tcg/tci/tcg-target.h
+++ b/tcg/tci/tcg-target.h
@@ -43,6 +43,7 @@
 #define TCG_TARGET_INTERPRETER 1
 #define TCG_TARGET_INSN_UNIT_SIZE 1
 #define TCG_TARGET_TLB_DISPLACEMENT_BITS 32
+#define MAX_CODE_GEN_BUFFER_SIZE  ((size_t)-1)
 
 #if UINTPTR_MAX == UINT32_MAX
 # define TCG_TARGET_REG_BITS 32
diff --git a/tcg/region.c b/tcg/region.c
index e3fbf6a7e7..ae22308290 100644
--- a/tcg/region.c
+++ b/tcg/region.c
@@ -398,34 +398,17 @@ static size_t tcg_n_regions(unsigned max_cpus)
 #endif
 }
 
-/* Minimum size of the code gen buffer.  This number is randomly chosen,
-   but not so small that we can't have a fair number of TB's live.  */
+/*
+ * Minimum size of the code gen buffer.  This number is randomly chosen,
+ * but not so small that we can't have a fair number of TB's live.
+ *
+ * Maximum size, MAX_CODE_GEN_BUFFER_SIZE, is defined in tcg-target.h.
+ * Unless otherwise indicated, this is constrained by the range of
+ * direct branches on the host cpu, as used by the TCG implementation
+ * of goto_tb.
+ */
 #define MIN_CODE_GEN_BUFFER_SIZE     (1 * MiB)
 
-/* Maximum size of the code gen buffer we'd like to use.  Unless otherwise
-   indicated, this is constrained by the range of direct branches on the
-   host cpu, as used by the TCG implementation of goto_tb.  */
-#if defined(__x86_64__)
-# define MAX_CODE_GEN_BUFFER_SIZE  (2 * GiB)
-#elif defined(__sparc__)
-# define MAX_CODE_GEN_BUFFER_SIZE  (2 * GiB)
-#elif defined(__powerpc64__)
-# define MAX_CODE_GEN_BUFFER_SIZE  (2 * GiB)
-#elif defined(__powerpc__)
-# define MAX_CODE_GEN_BUFFER_SIZE  (32 * MiB)
-#elif defined(__aarch64__)
-# define MAX_CODE_GEN_BUFFER_SIZE  (2 * GiB)
-#elif defined(__s390x__)
-  /* We have a +- 4GB range on the branches; leave some slop.  */
-# define MAX_CODE_GEN_BUFFER_SIZE  (3 * GiB)
-#elif defined(__mips__)
-  /* We have a 256MB branch region, but leave room to make sure the
-     main executable is also within that region.  */
-# define MAX_CODE_GEN_BUFFER_SIZE  (128 * MiB)
-#else
-# define MAX_CODE_GEN_BUFFER_SIZE  ((size_t)-1)
-#endif
-
 #if TCG_TARGET_REG_BITS == 32
 #define DEFAULT_CODE_GEN_BUFFER_SIZE_1 (32 * MiB)
 #ifdef CONFIG_USER_ONLY
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 16/29] tcg: Replace region.end with region.total_size
  2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
                   ` (14 preceding siblings ...)
  2021-03-14 21:27 ` [PATCH v2 15/29] tcg: Move MAX_CODE_GEN_BUFFER_SIZE to tcg-target.h Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
  2021-03-14 21:27 ` [PATCH v2 17/29] tcg: Rename region.start to region.after_prologue Richard Henderson
                   ` (14 subsequent siblings)
  30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: r.bolshakov, j

A size is easier to work with than an end point,
particularly during initial buffer allocation.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/region.c | 29 +++++++++++++++++------------
 1 file changed, 17 insertions(+), 12 deletions(-)

diff --git a/tcg/region.c b/tcg/region.c
index ae22308290..8e4dd0480b 100644
--- a/tcg/region.c
+++ b/tcg/region.c
@@ -48,7 +48,7 @@ struct tcg_region_state {
     /* fields set at init time */
     void *start;
     void *start_aligned;
-    void *end;
+    size_t total_size; /* size of entire buffer */
     size_t n;
     size_t size; /* size of one region */
     size_t stride; /* .size + guard size */
@@ -279,7 +279,7 @@ static void tcg_region_bounds(size_t curr_region, void **pstart, void **pend)
         start = region.start;
     }
     if (curr_region == region.n - 1) {
-        end = region.end;
+        end = region.start_aligned + region.total_size;
     }
 
     *pstart = start;
@@ -813,8 +813,8 @@ static bool alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
  */
 void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
 {
-    void *buf, *aligned;
-    size_t size;
+    void *buf, *aligned, *end;
+    size_t total_size;
     size_t page_size;
     size_t region_size;
     size_t n_regions;
@@ -827,19 +827,20 @@ void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
     assert(ok);
 
     buf = tcg_init_ctx.code_gen_buffer;
-    size = tcg_init_ctx.code_gen_buffer_size;
+    total_size = tcg_init_ctx.code_gen_buffer_size;
     page_size = qemu_real_host_page_size;
     n_regions = tcg_n_regions(max_cpus);
 
     /* The first region will be 'aligned - buf' bytes larger than the others */
     aligned = QEMU_ALIGN_PTR_UP(buf, page_size);
-    g_assert(aligned < tcg_init_ctx.code_gen_buffer + size);
+    g_assert(aligned < tcg_init_ctx.code_gen_buffer + total_size);
+
     /*
      * Make region_size a multiple of page_size, using aligned as the start.
      * As a result of this we might end up with a few extra pages at the end of
      * the buffer; we will assign those to the last region.
      */
-    region_size = (size - (aligned - buf)) / n_regions;
+    region_size = (total_size - (aligned - buf)) / n_regions;
     region_size = QEMU_ALIGN_DOWN(region_size, page_size);
 
     /* A region must have at least 2 pages; one code, one guard */
@@ -853,9 +854,11 @@ void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
     region.start = buf;
     region.start_aligned = aligned;
     /* page-align the end, since its last page will be a guard page */
-    region.end = QEMU_ALIGN_PTR_DOWN(buf + size, page_size);
+    end = QEMU_ALIGN_PTR_DOWN(buf + total_size, page_size);
     /* account for that last guard page */
-    region.end -= page_size;
+    end -= page_size;
+    total_size = end - aligned;
+    region.total_size = total_size;
 
     /* set guard pages */
     splitwx_diff = tcg_splitwx_diff;
@@ -893,7 +896,7 @@ void tcg_region_prologue_set(TCGContext *s)
 
     /* Register the balance of the buffer with gdb. */
     tcg_register_jit(tcg_splitwx_to_rx(region.start),
-                     region.end - region.start);
+                     region.start_aligned + region.total_size - region.start);
 }
 
 /*
@@ -934,8 +937,10 @@ size_t tcg_code_capacity(void)
 
     /* no need for synchronization; these variables are set at init time */
     guard_size = region.stride - region.size;
-    capacity = region.end + guard_size - region.start;
-    capacity -= region.n * (guard_size + TCG_HIGHWATER);
+    capacity = region.total_size;
+    capacity -= (region.n - 1) * guard_size;
+    capacity -= region.n * TCG_HIGHWATER;
+
     return capacity;
 }
 
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 17/29] tcg: Rename region.start to region.after_prologue
  2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
                   ` (15 preceding siblings ...)
  2021-03-14 21:27 ` [PATCH v2 16/29] tcg: Replace region.end with region.total_size Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
  2021-03-14 21:27 ` [PATCH v2 18/29] tcg: Tidy tcg_n_regions Richard Henderson
                   ` (13 subsequent siblings)
  30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: r.bolshakov, j

Give the field a name reflecting its actual meaning.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/region.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/tcg/region.c b/tcg/region.c
index 8e4dd0480b..23261561a1 100644
--- a/tcg/region.c
+++ b/tcg/region.c
@@ -46,8 +46,8 @@ struct tcg_region_state {
     QemuMutex lock;
 
     /* fields set at init time */
-    void *start;
     void *start_aligned;
+    void *after_prologue;
     size_t total_size; /* size of entire buffer */
     size_t n;
     size_t size; /* size of one region */
@@ -276,7 +276,7 @@ static void tcg_region_bounds(size_t curr_region, void **pstart, void **pend)
     end = start + region.size;
 
     if (curr_region == 0) {
-        start = region.start;
+        start = region.after_prologue;
     }
     if (curr_region == region.n - 1) {
         end = region.start_aligned + region.total_size;
@@ -851,7 +851,7 @@ void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
     region.n = n_regions;
     region.size = region_size - page_size;
     region.stride = region_size;
-    region.start = buf;
+    region.after_prologue = buf;
     region.start_aligned = aligned;
     /* page-align the end, since its last page will be a guard page */
     end = QEMU_ALIGN_PTR_DOWN(buf + total_size, page_size);
@@ -888,15 +888,16 @@ void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
 void tcg_region_prologue_set(TCGContext *s)
 {
     /* Deduct the prologue from the first region.  */
-    g_assert(region.start == s->code_gen_buffer);
-    region.start = s->code_ptr;
+    g_assert(region.start_aligned == s->code_gen_buffer);
+    region.after_prologue = s->code_ptr;
 
     /* Recompute boundaries of the first region. */
     tcg_region_assign(s, 0);
 
     /* Register the balance of the buffer with gdb. */
-    tcg_register_jit(tcg_splitwx_to_rx(region.start),
-                     region.start_aligned + region.total_size - region.start);
+    tcg_register_jit(tcg_splitwx_to_rx(region.after_prologue),
+                     region.start_aligned + region.total_size -
+                     region.after_prologue);
 }
 
 /*
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 18/29] tcg: Tidy tcg_n_regions
  2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
                   ` (16 preceding siblings ...)
  2021-03-14 21:27 ` [PATCH v2 17/29] tcg: Rename region.start to region.after_prologue Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
  2021-03-14 21:27 ` [PATCH v2 19/29] tcg: Tidy split_cross_256mb Richard Henderson
                   ` (12 subsequent siblings)
  30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: r.bolshakov, j

Compute the value using straight division and bounds,
rather than a loop.  Pass in tb_size rather than reading
from tcg_init_ctx.code_gen_buffer_size,

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/region.c | 29 ++++++++++++-----------------
 1 file changed, 12 insertions(+), 17 deletions(-)

diff --git a/tcg/region.c b/tcg/region.c
index 23261561a1..23b3459c61 100644
--- a/tcg/region.c
+++ b/tcg/region.c
@@ -363,38 +363,33 @@ void tcg_region_reset_all(void)
     tcg_region_tree_reset_all();
 }
 
-static size_t tcg_n_regions(unsigned max_cpus)
+static size_t tcg_n_regions(size_t tb_size, unsigned max_cpus)
 {
 #ifdef CONFIG_USER_ONLY
     return 1;
 #else
+    size_t n_regions;
+
     /*
      * It is likely that some vCPUs will translate more code than others,
      * so we first try to set more regions than max_cpus, with those regions
      * being of reasonable size. If that's not possible we make do by evenly
      * dividing the code_gen_buffer among the vCPUs.
      */
-    size_t i;
-
     /* Use a single region if all we have is one vCPU thread */
     if (max_cpus == 1 || !qemu_tcg_mttcg_enabled()) {
         return 1;
     }
 
-    /* Try to have more regions than max_cpus, with each region being >= 2 MB */
-    for (i = 8; i > 0; i--) {
-        size_t regions_per_thread = i;
-        size_t region_size;
-
-        region_size = tcg_init_ctx.code_gen_buffer_size;
-        region_size /= max_cpus * regions_per_thread;
-
-        if (region_size >= 2 * 1024u * 1024) {
-            return max_cpus * regions_per_thread;
-        }
+    /*
+     * Try to have more regions than max_cpus, with each region being >= 2 MB.
+     * If we can't, then just allocate one region per vCPU thread.
+     */
+    n_regions = tb_size / (2 * MiB);
+    if (n_regions <= max_cpus) {
+        return max_cpus;
     }
-    /* If we can't, then just allocate one region per vCPU thread */
-    return max_cpus;
+    return MIN(n_regions, max_cpus * 8);
 #endif
 }
 
@@ -829,7 +824,7 @@ void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
     buf = tcg_init_ctx.code_gen_buffer;
     total_size = tcg_init_ctx.code_gen_buffer_size;
     page_size = qemu_real_host_page_size;
-    n_regions = tcg_n_regions(max_cpus);
+    n_regions = tcg_n_regions(total_size, max_cpus);
 
     /* The first region will be 'aligned - buf' bytes larger than the others */
     aligned = QEMU_ALIGN_PTR_UP(buf, page_size);
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 19/29] tcg: Tidy split_cross_256mb
  2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
                   ` (17 preceding siblings ...)
  2021-03-14 21:27 ` [PATCH v2 18/29] tcg: Tidy tcg_n_regions Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
  2021-03-14 21:27 ` [PATCH v2 20/29] tcg: Move in_code_gen_buffer and tests to region.c Richard Henderson
                   ` (11 subsequent siblings)
  30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: r.bolshakov, j

Return output buffer and size via output pointer arguments,
rather than returning size via tcg_ctx->code_gen_buffer_size.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/region.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/tcg/region.c b/tcg/region.c
index 23b3459c61..45c1178815 100644
--- a/tcg/region.c
+++ b/tcg/region.c
@@ -467,7 +467,8 @@ static inline bool cross_256mb(void *addr, size_t size)
 /* We weren't able to allocate a buffer without crossing that boundary,
    so make do with the larger portion of the buffer that doesn't cross.
    Returns the new base of the buffer, and adjusts code_gen_buffer_size.  */
-static inline void *split_cross_256mb(void *buf1, size_t size1)
+static inline void split_cross_256mb(void **obuf, size_t *osize,
+                                     void *buf1, size_t size1)
 {
     void *buf2 = (void *)(((uintptr_t)buf1 + size1) & ~0x0ffffffful);
     size_t size2 = buf1 + size1 - buf2;
@@ -478,8 +479,8 @@ static inline void *split_cross_256mb(void *buf1, size_t size1)
         buf1 = buf2;
     }
 
-    tcg_ctx->code_gen_buffer_size = size1;
-    return buf1;
+    *obuf = buf1;
+    *osize = size1;
 }
 #endif
 
@@ -509,12 +510,10 @@ static bool alloc_code_gen_buffer(size_t tb_size, int splitwx, Error **errp)
     if (size > tb_size) {
         size = QEMU_ALIGN_DOWN(tb_size, qemu_real_host_page_size);
     }
-    tcg_ctx->code_gen_buffer_size = size;
 
 #ifdef __mips__
     if (cross_256mb(buf, size)) {
-        buf = split_cross_256mb(buf, size);
-        size = tcg_ctx->code_gen_buffer_size;
+        split_cross_256mb(&buf, &size, buf, size);
     }
 #endif
 
@@ -525,6 +524,7 @@ static bool alloc_code_gen_buffer(size_t tb_size, int splitwx, Error **errp)
     qemu_madvise(buf, size, QEMU_MADV_HUGEPAGE);
 
     tcg_ctx->code_gen_buffer = buf;
+    tcg_ctx->code_gen_buffer_size = size;
     return true;
 }
 #elif defined(_WIN32)
@@ -583,8 +583,7 @@ static bool alloc_code_gen_buffer_anon(size_t size, int prot,
             /* fallthru */
         default:
             /* Split the original buffer.  Free the smaller half.  */
-            buf2 = split_cross_256mb(buf, size);
-            size2 = tcg_ctx->code_gen_buffer_size;
+            split_cross_256mb(&buf2, &size2, buf, size);
             if (buf == buf2) {
                 munmap(buf + size2, size - size2);
             } else {
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 20/29] tcg: Move in_code_gen_buffer and tests to region.c
  2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
                   ` (18 preceding siblings ...)
  2021-03-14 21:27 ` [PATCH v2 19/29] tcg: Tidy split_cross_256mb Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
  2021-03-14 21:27 ` [PATCH v2 21/29] tcg: Allocate code_gen_buffer into struct tcg_region_state Richard Henderson
                   ` (10 subsequent siblings)
  30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: r.bolshakov, j

Shortly, the full code_gen_buffer will only be visible
to region.c, so move in_code_gen_buffer out-of-line.

Move the debugging versions of tcg_splitwx_to_{rx,rw}
to region.c as well, so that the compiler gets to see
the implementation of in_code_gen_buffer.

This leaves exactly one use of in_code_gen_buffer outside
of region.c, in cpu_restore_state.  Which, being on the
exception path, is not performance critical.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/tcg/tcg.h | 11 +----------
 tcg/region.c      | 34 ++++++++++++++++++++++++++++++++++
 tcg/tcg.c         | 23 -----------------------
 3 files changed, 35 insertions(+), 33 deletions(-)

diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index a0122c0dd3..a19deb529f 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -696,16 +696,7 @@ extern const void *tcg_code_gen_epilogue;
 extern uintptr_t tcg_splitwx_diff;
 extern TCGv_env cpu_env;
 
-static inline bool in_code_gen_buffer(const void *p)
-{
-    const TCGContext *s = &tcg_init_ctx;
-    /*
-     * Much like it is valid to have a pointer to the byte past the
-     * end of an array (so long as you don't dereference it), allow
-     * a pointer to the byte past the end of the code gen buffer.
-     */
-    return (size_t)(p - s->code_gen_buffer) <= s->code_gen_buffer_size;
-}
+bool in_code_gen_buffer(const void *p);
 
 #ifdef CONFIG_DEBUG_TCG
 const void *tcg_splitwx_to_rx(void *rw);
diff --git a/tcg/region.c b/tcg/region.c
index 45c1178815..bf4167e467 100644
--- a/tcg/region.c
+++ b/tcg/region.c
@@ -68,6 +68,40 @@ static struct tcg_region_state region;
 static void *region_trees;
 static size_t tree_size;
 
+bool in_code_gen_buffer(const void *p)
+{
+    const TCGContext *s = &tcg_init_ctx;
+    /*
+     * Much like it is valid to have a pointer to the byte past the
+     * end of an array (so long as you don't dereference it), allow
+     * a pointer to the byte past the end of the code gen buffer.
+     */
+    return (size_t)(p - s->code_gen_buffer) <= s->code_gen_buffer_size;
+}
+
+#ifdef CONFIG_DEBUG_TCG
+const void *tcg_splitwx_to_rx(void *rw)
+{
+    /* Pass NULL pointers unchanged. */
+    if (rw) {
+        g_assert(in_code_gen_buffer(rw));
+        rw += tcg_splitwx_diff;
+    }
+    return rw;
+}
+
+void *tcg_splitwx_to_rw(const void *rx)
+{
+    /* Pass NULL pointers unchanged. */
+    if (rx) {
+        rx -= tcg_splitwx_diff;
+        /* Assert that we end with a pointer in the rw region. */
+        g_assert(in_code_gen_buffer(rx));
+    }
+    return (void *)rx;
+}
+#endif /* CONFIG_DEBUG_TCG */
+
 /* compare a pointer @ptr and a tb_tc @s */
 static int ptr_cmp_tb_tc(const void *ptr, const struct tb_tc *s)
 {
diff --git a/tcg/tcg.c b/tcg/tcg.c
index a82d3a0861..65f9cf01d5 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -416,29 +416,6 @@ static const TCGTargetOpDef constraint_sets[] = {
 
 #include "tcg-target.c.inc"
 
-#ifdef CONFIG_DEBUG_TCG
-const void *tcg_splitwx_to_rx(void *rw)
-{
-    /* Pass NULL pointers unchanged. */
-    if (rw) {
-        g_assert(in_code_gen_buffer(rw));
-        rw += tcg_splitwx_diff;
-    }
-    return rw;
-}
-
-void *tcg_splitwx_to_rw(const void *rx)
-{
-    /* Pass NULL pointers unchanged. */
-    if (rx) {
-        rx -= tcg_splitwx_diff;
-        /* Assert that we end with a pointer in the rw region. */
-        g_assert(in_code_gen_buffer(rx));
-    }
-    return (void *)rx;
-}
-#endif /* CONFIG_DEBUG_TCG */
-
 static void alloc_tcg_plugin_context(TCGContext *s)
 {
 #ifdef CONFIG_PLUGIN
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 21/29] tcg: Allocate code_gen_buffer into struct tcg_region_state
  2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
                   ` (19 preceding siblings ...)
  2021-03-14 21:27 ` [PATCH v2 20/29] tcg: Move in_code_gen_buffer and tests to region.c Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
  2021-03-14 21:27 ` [PATCH v2 22/29] tcg: Return the map protection from alloc_code_gen_buffer Richard Henderson
                   ` (9 subsequent siblings)
  30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: r.bolshakov, j

Do not mess around with setting values within tcg_init_ctx.
Put the values into 'region' directly, which is where they
will live for the lifetime of the program.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/region.c | 64 ++++++++++++++++++++++------------------------------
 1 file changed, 27 insertions(+), 37 deletions(-)

diff --git a/tcg/region.c b/tcg/region.c
index bf4167e467..9a2b014838 100644
--- a/tcg/region.c
+++ b/tcg/region.c
@@ -70,13 +70,12 @@ static size_t tree_size;
 
 bool in_code_gen_buffer(const void *p)
 {
-    const TCGContext *s = &tcg_init_ctx;
     /*
      * Much like it is valid to have a pointer to the byte past the
      * end of an array (so long as you don't dereference it), allow
      * a pointer to the byte past the end of the code gen buffer.
      */
-    return (size_t)(p - s->code_gen_buffer) <= s->code_gen_buffer_size;
+    return (size_t)(p - region.start_aligned) <= region.total_size;
 }
 
 #ifdef CONFIG_DEBUG_TCG
@@ -557,8 +556,8 @@ static bool alloc_code_gen_buffer(size_t tb_size, int splitwx, Error **errp)
     }
     qemu_madvise(buf, size, QEMU_MADV_HUGEPAGE);
 
-    tcg_ctx->code_gen_buffer = buf;
-    tcg_ctx->code_gen_buffer_size = size;
+    region.start_aligned = buf;
+    region.total_size = size;
     return true;
 }
 #elif defined(_WIN32)
@@ -579,8 +578,8 @@ static bool alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
         return false;
     }
 
-    tcg_ctx->code_gen_buffer = buf;
-    tcg_ctx->code_gen_buffer_size = size;
+    region.start_aligned = buf;
+    region.total_size = size;
     return true;
 }
 #else
@@ -595,7 +594,6 @@ static bool alloc_code_gen_buffer_anon(size_t size, int prot,
                          "allocate %zu bytes for jit buffer", size);
         return false;
     }
-    tcg_ctx->code_gen_buffer_size = size;
 
 #ifdef __mips__
     if (cross_256mb(buf, size)) {
@@ -633,7 +631,8 @@ static bool alloc_code_gen_buffer_anon(size_t size, int prot,
     /* Request large pages for the buffer.  */
     qemu_madvise(buf, size, QEMU_MADV_HUGEPAGE);
 
-    tcg_ctx->code_gen_buffer = buf;
+    region.start_aligned = buf;
+    region.total_size = size;
     return true;
 }
 
@@ -654,8 +653,8 @@ static bool alloc_code_gen_buffer_splitwx_memfd(size_t size, Error **errp)
         return false;
     }
     /* The size of the mapping may have been adjusted. */
-    size = tcg_ctx->code_gen_buffer_size;
-    buf_rx = tcg_ctx->code_gen_buffer;
+    buf_rx = region.start_aligned;
+    size = region.total_size;
 #endif
 
     buf_rw = qemu_memfd_alloc("tcg-jit", size, 0, &fd, errp);
@@ -677,8 +676,8 @@ static bool alloc_code_gen_buffer_splitwx_memfd(size_t size, Error **errp)
 #endif
 
     close(fd);
-    tcg_ctx->code_gen_buffer = buf_rw;
-    tcg_ctx->code_gen_buffer_size = size;
+    region.start_aligned = buf_rw;
+    region.total_size = size;
     tcg_splitwx_diff = buf_rx - buf_rw;
 
     /* Request large pages for the buffer and the splitwx.  */
@@ -729,7 +728,7 @@ static bool alloc_code_gen_buffer_splitwx_vmremap(size_t size, Error **errp)
         return false;
     }
 
-    buf_rw = (mach_vm_address_t)tcg_ctx->code_gen_buffer;
+    buf_rw = region.start_aligned;
     buf_rx = 0;
     ret = mach_vm_remap(mach_task_self(),
                         &buf_rx,
@@ -841,11 +840,8 @@ static bool alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
  */
 void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
 {
-    void *buf, *aligned, *end;
-    size_t total_size;
     size_t page_size;
     size_t region_size;
-    size_t n_regions;
     size_t i;
     uintptr_t splitwx_diff;
     bool ok;
@@ -854,39 +850,33 @@ void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
                                splitwx, &error_fatal);
     assert(ok);
 
-    buf = tcg_init_ctx.code_gen_buffer;
-    total_size = tcg_init_ctx.code_gen_buffer_size;
-    page_size = qemu_real_host_page_size;
-    n_regions = tcg_n_regions(total_size, max_cpus);
-
-    /* The first region will be 'aligned - buf' bytes larger than the others */
-    aligned = QEMU_ALIGN_PTR_UP(buf, page_size);
-    g_assert(aligned < tcg_init_ctx.code_gen_buffer + total_size);
-
     /*
      * Make region_size a multiple of page_size, using aligned as the start.
      * As a result of this we might end up with a few extra pages at the end of
      * the buffer; we will assign those to the last region.
      */
-    region_size = (total_size - (aligned - buf)) / n_regions;
+    region.n = tcg_n_regions(region.total_size, max_cpus);
+    page_size = qemu_real_host_page_size;
+    region_size = region.total_size / region.n;
     region_size = QEMU_ALIGN_DOWN(region_size, page_size);
 
     /* A region must have at least 2 pages; one code, one guard */
     g_assert(region_size >= 2 * page_size);
+    region.stride = region_size;
+
+    /* Reserve space for guard pages. */
+    region.size = region_size - page_size;
+    region.total_size -= page_size;
+
+    /*
+     * The first region will be smaller than the others, via the prologue,
+     * which has yet to be allocated.  For now, the first region begins at
+     * the page boundary.
+     */
+    region.after_prologue = region.start_aligned;
 
     /* init the region struct */
     qemu_mutex_init(&region.lock);
-    region.n = n_regions;
-    region.size = region_size - page_size;
-    region.stride = region_size;
-    region.after_prologue = buf;
-    region.start_aligned = aligned;
-    /* page-align the end, since its last page will be a guard page */
-    end = QEMU_ALIGN_PTR_DOWN(buf + total_size, page_size);
-    /* account for that last guard page */
-    end -= page_size;
-    total_size = end - aligned;
-    region.total_size = total_size;
 
     /* set guard pages */
     splitwx_diff = tcg_splitwx_diff;
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 22/29] tcg: Return the map protection from alloc_code_gen_buffer
  2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
                   ` (20 preceding siblings ...)
  2021-03-14 21:27 ` [PATCH v2 21/29] tcg: Allocate code_gen_buffer into struct tcg_region_state Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
  2021-03-14 22:04   ` Philippe Mathieu-Daudé
  2021-03-14 21:27 ` [PATCH v2 23/29] tcg: Sink qemu_madvise call to common code Richard Henderson
                   ` (8 subsequent siblings)
  30 siblings, 1 reply; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: r.bolshakov, j

Change the interface from a boolean error indication to a
negative error vs a non-negative protection.  For the moment
this is only interface change, not making use of the new data.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/region.c | 63 +++++++++++++++++++++++++++-------------------------
 1 file changed, 33 insertions(+), 30 deletions(-)

diff --git a/tcg/region.c b/tcg/region.c
index 9a2b014838..3ca0d01fa4 100644
--- a/tcg/region.c
+++ b/tcg/region.c
@@ -521,14 +521,14 @@ static inline void split_cross_256mb(void **obuf, size_t *osize,
 static uint8_t static_code_gen_buffer[DEFAULT_CODE_GEN_BUFFER_SIZE]
     __attribute__((aligned(CODE_GEN_ALIGN)));
 
-static bool alloc_code_gen_buffer(size_t tb_size, int splitwx, Error **errp)
+static int alloc_code_gen_buffer(size_t tb_size, int splitwx, Error **errp)
 {
     void *buf, *end;
     size_t size;
 
     if (splitwx > 0) {
         error_setg(errp, "jit split-wx not supported");
-        return false;
+        return -1;
     }
 
     /* page-align the beginning and end of the buffer */
@@ -558,16 +558,17 @@ static bool alloc_code_gen_buffer(size_t tb_size, int splitwx, Error **errp)
 
     region.start_aligned = buf;
     region.total_size = size;
-    return true;
+
+    return PROT_READ | PROT_WRITE;
 }
 #elif defined(_WIN32)
-static bool alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
+static int alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
 {
     void *buf;
 
     if (splitwx > 0) {
         error_setg(errp, "jit split-wx not supported");
-        return false;
+        return -1;
     }
 
     buf = VirtualAlloc(NULL, size, MEM_RESERVE | MEM_COMMIT,
@@ -580,11 +581,12 @@ static bool alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
 
     region.start_aligned = buf;
     region.total_size = size;
-    return true;
+
+    return PAGE_READ | PAGE_WRITE | PAGE_EXEC;
 }
 #else
-static bool alloc_code_gen_buffer_anon(size_t size, int prot,
-                                       int flags, Error **errp)
+static int alloc_code_gen_buffer_anon(size_t size, int prot,
+                                      int flags, Error **errp)
 {
     void *buf;
 
@@ -592,7 +594,7 @@ static bool alloc_code_gen_buffer_anon(size_t size, int prot,
     if (buf == MAP_FAILED) {
         error_setg_errno(errp, errno,
                          "allocate %zu bytes for jit buffer", size);
-        return false;
+        return -1;
     }
 
 #ifdef __mips__
@@ -633,7 +635,7 @@ static bool alloc_code_gen_buffer_anon(size_t size, int prot,
 
     region.start_aligned = buf;
     region.total_size = size;
-    return true;
+    return prot;
 }
 
 #ifndef CONFIG_TCG_INTERPRETER
@@ -647,9 +649,9 @@ static bool alloc_code_gen_buffer_splitwx_memfd(size_t size, Error **errp)
 
 #ifdef __mips__
     /* Find space for the RX mapping, vs the 256MiB regions. */
-    if (!alloc_code_gen_buffer_anon(size, PROT_NONE,
-                                    MAP_PRIVATE | MAP_ANONYMOUS |
-                                    MAP_NORESERVE, errp)) {
+    if (alloc_code_gen_buffer_anon(size, PROT_NONE,
+                                   MAP_PRIVATE | MAP_ANONYMOUS |
+                                   MAP_NORESERVE, errp) < 0) {
         return false;
     }
     /* The size of the mapping may have been adjusted. */
@@ -683,7 +685,7 @@ static bool alloc_code_gen_buffer_splitwx_memfd(size_t size, Error **errp)
     /* Request large pages for the buffer and the splitwx.  */
     qemu_madvise(buf_rw, size, QEMU_MADV_HUGEPAGE);
     qemu_madvise(buf_rx, size, QEMU_MADV_HUGEPAGE);
-    return true;
+    return PROT_READ | PROT_WRITE;
 
  fail_rx:
     error_setg_errno(errp, errno, "failed to map shared memory for execute");
@@ -697,7 +699,7 @@ static bool alloc_code_gen_buffer_splitwx_memfd(size_t size, Error **errp)
     if (fd >= 0) {
         close(fd);
     }
-    return false;
+    return -1;
 }
 #endif /* CONFIG_POSIX */
 
@@ -716,7 +718,7 @@ extern kern_return_t mach_vm_remap(vm_map_t target_task,
                                    vm_prot_t *max_protection,
                                    vm_inherit_t inheritance);
 
-static bool alloc_code_gen_buffer_splitwx_vmremap(size_t size, Error **errp)
+static int alloc_code_gen_buffer_splitwx_vmremap(size_t size, Error **errp)
 {
     kern_return_t ret;
     mach_vm_address_t buf_rw, buf_rx;
@@ -725,7 +727,7 @@ static bool alloc_code_gen_buffer_splitwx_vmremap(size_t size, Error **errp)
     /* Map the read-write portion via normal anon memory. */
     if (!alloc_code_gen_buffer_anon(size, PROT_READ | PROT_WRITE,
                                     MAP_PRIVATE | MAP_ANONYMOUS, errp)) {
-        return false;
+        return -1;
     }
 
     buf_rw = region.start_aligned;
@@ -745,23 +747,23 @@ static bool alloc_code_gen_buffer_splitwx_vmremap(size_t size, Error **errp)
         /* TODO: Convert "ret" to a human readable error message. */
         error_setg(errp, "vm_remap for jit splitwx failed");
         munmap((void *)buf_rw, size);
-        return false;
+        return -1;
     }
 
     if (mprotect((void *)buf_rx, size, PROT_READ | PROT_EXEC) != 0) {
         error_setg_errno(errp, errno, "mprotect for jit splitwx");
         munmap((void *)buf_rx, size);
         munmap((void *)buf_rw, size);
-        return false;
+        return -1;
     }
 
     tcg_splitwx_diff = buf_rx - buf_rw;
-    return true;
+    return PROT_READ | PROT_WRITE;
 }
 #endif /* CONFIG_DARWIN */
 #endif /* CONFIG_TCG_INTERPRETER */
 
-static bool alloc_code_gen_buffer_splitwx(size_t size, Error **errp)
+static int alloc_code_gen_buffer_splitwx(size_t size, Error **errp)
 {
 #ifndef CONFIG_TCG_INTERPRETER
 # ifdef CONFIG_DARWIN
@@ -772,24 +774,25 @@ static bool alloc_code_gen_buffer_splitwx(size_t size, Error **errp)
 # endif
 #endif
     error_setg(errp, "jit split-wx not supported");
-    return false;
+    return -1;
 }
 
-static bool alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
+static int alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
 {
     ERRP_GUARD();
     int prot, flags;
 
     if (splitwx) {
-        if (alloc_code_gen_buffer_splitwx(size, errp)) {
-            return true;
+        prot = alloc_code_gen_buffer_splitwx(size, errp);
+        if (prot >= 0) {
+            return prot;
         }
         /*
          * If splitwx force-on (1), fail;
          * if splitwx default-on (-1), fall through to splitwx off.
          */
         if (splitwx > 0) {
-            return false;
+            return -1;
         }
         error_free_or_abort(errp);
     }
@@ -844,11 +847,11 @@ void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
     size_t region_size;
     size_t i;
     uintptr_t splitwx_diff;
-    bool ok;
+    int have_prot;
 
-    ok = alloc_code_gen_buffer(size_code_gen_buffer(tb_size),
-                               splitwx, &error_fatal);
-    assert(ok);
+    have_prot = alloc_code_gen_buffer(size_code_gen_buffer(tb_size),
+                                      splitwx, &error_fatal);
+    assert(have_prot >= 0);
 
     /*
      * Make region_size a multiple of page_size, using aligned as the start.
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 23/29] tcg: Sink qemu_madvise call to common code
  2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
                   ` (21 preceding siblings ...)
  2021-03-14 21:27 ` [PATCH v2 22/29] tcg: Return the map protection from alloc_code_gen_buffer Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
  2021-03-14 21:27 ` [PATCH v2 24/29] tcg: Do not set guard pages in the rx buffer Richard Henderson
                   ` (7 subsequent siblings)
  30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: r.bolshakov, j

Move the call out of the N versions of alloc_code_gen_buffer
and into tcg_region_init.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/region.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/tcg/region.c b/tcg/region.c
index 3ca0d01fa4..994c083343 100644
--- a/tcg/region.c
+++ b/tcg/region.c
@@ -554,7 +554,6 @@ static int alloc_code_gen_buffer(size_t tb_size, int splitwx, Error **errp)
         error_setg_errno(errp, errno, "mprotect of jit buffer");
         return false;
     }
-    qemu_madvise(buf, size, QEMU_MADV_HUGEPAGE);
 
     region.start_aligned = buf;
     region.total_size = size;
@@ -630,9 +629,6 @@ static int alloc_code_gen_buffer_anon(size_t size, int prot,
     }
 #endif
 
-    /* Request large pages for the buffer.  */
-    qemu_madvise(buf, size, QEMU_MADV_HUGEPAGE);
-
     region.start_aligned = buf;
     region.total_size = size;
     return prot;
@@ -682,9 +678,6 @@ static bool alloc_code_gen_buffer_splitwx_memfd(size_t size, Error **errp)
     region.total_size = size;
     tcg_splitwx_diff = buf_rx - buf_rw;
 
-    /* Request large pages for the buffer and the splitwx.  */
-    qemu_madvise(buf_rw, size, QEMU_MADV_HUGEPAGE);
-    qemu_madvise(buf_rx, size, QEMU_MADV_HUGEPAGE);
     return PROT_READ | PROT_WRITE;
 
  fail_rx:
@@ -853,6 +846,13 @@ void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
                                       splitwx, &error_fatal);
     assert(have_prot >= 0);
 
+    /* Request large pages for the buffer and the splitwx.  */
+    qemu_madvise(region.start_aligned, region.total_size, QEMU_MADV_HUGEPAGE);
+    if (tcg_splitwx_diff) {
+        qemu_madvise(region.start_aligned + tcg_splitwx_diff,
+                     region.total_size, QEMU_MADV_HUGEPAGE);
+    }
+
     /*
      * Make region_size a multiple of page_size, using aligned as the start.
      * As a result of this we might end up with a few extra pages at the end of
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 24/29] tcg: Do not set guard pages in the rx buffer
  2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
                   ` (22 preceding siblings ...)
  2021-03-14 21:27 ` [PATCH v2 23/29] tcg: Sink qemu_madvise call to common code Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
  2021-03-14 21:27 ` [PATCH v2 25/29] util/osdep: Add qemu_mprotect_rw Richard Henderson
                   ` (6 subsequent siblings)
  30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: r.bolshakov, j

We only need guard pages in the rw buffer to avoid buffer overruns.
Let the rx buffer keep large pages all the way through.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/region.c | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/tcg/region.c b/tcg/region.c
index 994c083343..27a7e35c8e 100644
--- a/tcg/region.c
+++ b/tcg/region.c
@@ -839,7 +839,6 @@ void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
     size_t page_size;
     size_t region_size;
     size_t i;
-    uintptr_t splitwx_diff;
     int have_prot;
 
     have_prot = alloc_code_gen_buffer(size_code_gen_buffer(tb_size),
@@ -881,8 +880,7 @@ void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
     /* init the region struct */
     qemu_mutex_init(&region.lock);
 
-    /* set guard pages */
-    splitwx_diff = tcg_splitwx_diff;
+    /* Set guard pages.  No need to do this for the rx_buf, only the rw_buf. */
     for (i = 0; i < region.n; i++) {
         void *start, *end;
         int rc;
@@ -890,10 +888,6 @@ void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
         tcg_region_bounds(i, &start, &end);
         rc = qemu_mprotect_none(end, page_size);
         g_assert(!rc);
-        if (splitwx_diff) {
-            rc = qemu_mprotect_none(end + splitwx_diff, page_size);
-            g_assert(!rc);
-        }
     }
 
     tcg_region_trees_init();
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 25/29] util/osdep: Add qemu_mprotect_rw
  2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
                   ` (23 preceding siblings ...)
  2021-03-14 21:27 ` [PATCH v2 24/29] tcg: Do not set guard pages in the rx buffer Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
  2021-03-14 21:27 ` [PATCH v2 26/29] tcg: Round the tb_size default from qemu_get_host_physmem Richard Henderson
                   ` (5 subsequent siblings)
  30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: r.bolshakov, j, Philippe Mathieu-Daudé

For --enable-tcg-interpreter on Windows, we will need this.

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/qemu/osdep.h | 1 +
 util/osdep.c         | 9 +++++++++
 2 files changed, 10 insertions(+)

diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
index ba15be9c56..5cc2e57bdf 100644
--- a/include/qemu/osdep.h
+++ b/include/qemu/osdep.h
@@ -494,6 +494,7 @@ void sigaction_invoke(struct sigaction *action,
 #endif
 
 int qemu_madvise(void *addr, size_t len, int advice);
+int qemu_mprotect_rw(void *addr, size_t size);
 int qemu_mprotect_rwx(void *addr, size_t size);
 int qemu_mprotect_none(void *addr, size_t size);
 
diff --git a/util/osdep.c b/util/osdep.c
index 66d01b9160..42a0a4986a 100644
--- a/util/osdep.c
+++ b/util/osdep.c
@@ -97,6 +97,15 @@ static int qemu_mprotect__osdep(void *addr, size_t size, int prot)
 #endif
 }
 
+int qemu_mprotect_rw(void *addr, size_t size)
+{
+#ifdef _WIN32
+    return qemu_mprotect__osdep(addr, size, PAGE_READWRITE);
+#else
+    return qemu_mprotect__osdep(addr, size, PROT_READ | PROT_WRITE);
+#endif
+}
+
 int qemu_mprotect_rwx(void *addr, size_t size)
 {
 #ifdef _WIN32
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 26/29] tcg: Round the tb_size default from qemu_get_host_physmem
  2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
                   ` (24 preceding siblings ...)
  2021-03-14 21:27 ` [PATCH v2 25/29] util/osdep: Add qemu_mprotect_rw Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
  2021-03-14 21:27 ` [PATCH v2 27/29] tcg: Merge buffer protection and guard page protection Richard Henderson
                   ` (4 subsequent siblings)
  30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: r.bolshakov, j

If qemu_get_host_physmem returns an odd number of pages,
then physmem / 8 will not be a multiple of the page size.

The following was observed on a gitlab runner:

ERROR qtest-arm/boot-serial-test - Bail out!
ERROR:../util/osdep.c:80:qemu_mprotect__osdep: \
  assertion failed: (!(size & ~qemu_real_host_page_mask))

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/region.c | 47 +++++++++++++++++++++--------------------------
 1 file changed, 21 insertions(+), 26 deletions(-)

diff --git a/tcg/region.c b/tcg/region.c
index 27a7e35c8e..4dc1237ff4 100644
--- a/tcg/region.c
+++ b/tcg/region.c
@@ -469,26 +469,6 @@ static size_t tcg_n_regions(size_t tb_size, unsigned max_cpus)
   (DEFAULT_CODE_GEN_BUFFER_SIZE_1 < MAX_CODE_GEN_BUFFER_SIZE \
    ? DEFAULT_CODE_GEN_BUFFER_SIZE_1 : MAX_CODE_GEN_BUFFER_SIZE)
 
-static size_t size_code_gen_buffer(size_t tb_size)
-{
-    /* Size the buffer.  */
-    if (tb_size == 0) {
-        size_t phys_mem = qemu_get_host_physmem();
-        if (phys_mem == 0) {
-            tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE;
-        } else {
-            tb_size = MIN(DEFAULT_CODE_GEN_BUFFER_SIZE, phys_mem / 8);
-        }
-    }
-    if (tb_size < MIN_CODE_GEN_BUFFER_SIZE) {
-        tb_size = MIN_CODE_GEN_BUFFER_SIZE;
-    }
-    if (tb_size > MAX_CODE_GEN_BUFFER_SIZE) {
-        tb_size = MAX_CODE_GEN_BUFFER_SIZE;
-    }
-    return tb_size;
-}
-
 #ifdef __mips__
 /* In order to use J and JAL within the code_gen_buffer, we require
    that the buffer not cross a 256MB boundary.  */
@@ -836,13 +816,29 @@ static int alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
  */
 void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
 {
-    size_t page_size;
+    const size_t page_size = qemu_real_host_page_size;
     size_t region_size;
     size_t i;
     int have_prot;
 
-    have_prot = alloc_code_gen_buffer(size_code_gen_buffer(tb_size),
-                                      splitwx, &error_fatal);
+    /* Size the buffer.  */
+    if (tb_size == 0) {
+        size_t phys_mem = qemu_get_host_physmem();
+        if (phys_mem == 0) {
+            tb_size = DEFAULT_CODE_GEN_BUFFER_SIZE;
+        } else {
+            tb_size = QEMU_ALIGN_DOWN(phys_mem / 8, page_size);
+            tb_size = MIN(DEFAULT_CODE_GEN_BUFFER_SIZE, tb_size);
+        }
+    }
+    if (tb_size < MIN_CODE_GEN_BUFFER_SIZE) {
+        tb_size = MIN_CODE_GEN_BUFFER_SIZE;
+    }
+    if (tb_size > MAX_CODE_GEN_BUFFER_SIZE) {
+        tb_size = MAX_CODE_GEN_BUFFER_SIZE;
+    }
+
+    have_prot = alloc_code_gen_buffer(tb_size, splitwx, &error_fatal);
     assert(have_prot >= 0);
 
     /* Request large pages for the buffer and the splitwx.  */
@@ -857,9 +853,8 @@ void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
      * As a result of this we might end up with a few extra pages at the end of
      * the buffer; we will assign those to the last region.
      */
-    region.n = tcg_n_regions(region.total_size, max_cpus);
-    page_size = qemu_real_host_page_size;
-    region_size = region.total_size / region.n;
+    region.n = tcg_n_regions(tb_size, max_cpus);
+    region_size = tb_size / region.n;
     region_size = QEMU_ALIGN_DOWN(region_size, page_size);
 
     /* A region must have at least 2 pages; one code, one guard */
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 27/29] tcg: Merge buffer protection and guard page protection
  2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
                   ` (25 preceding siblings ...)
  2021-03-14 21:27 ` [PATCH v2 26/29] tcg: Round the tb_size default from qemu_get_host_physmem Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
  2021-03-14 21:27 ` [PATCH v2 28/29] tcg: When allocating for !splitwx, begin with PROT_NONE Richard Henderson
                   ` (3 subsequent siblings)
  30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: r.bolshakov, j

Do not handle protections on a case-by-case basis in the
various alloc_code_gen_buffer instances; do it within a
single loop in tcg_region_init.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/region.c | 40 +++++++++++++++++++++++++++++-----------
 1 file changed, 29 insertions(+), 11 deletions(-)

diff --git a/tcg/region.c b/tcg/region.c
index 4dc1237ff4..fac416ebf5 100644
--- a/tcg/region.c
+++ b/tcg/region.c
@@ -530,11 +530,6 @@ static int alloc_code_gen_buffer(size_t tb_size, int splitwx, Error **errp)
     }
 #endif
 
-    if (qemu_mprotect_rwx(buf, size)) {
-        error_setg_errno(errp, errno, "mprotect of jit buffer");
-        return false;
-    }
-
     region.start_aligned = buf;
     region.total_size = size;
 
@@ -818,8 +813,7 @@ void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
 {
     const size_t page_size = qemu_real_host_page_size;
     size_t region_size;
-    size_t i;
-    int have_prot;
+    int have_prot, need_prot;
 
     /* Size the buffer.  */
     if (tb_size == 0) {
@@ -875,14 +869,38 @@ void tcg_region_init(size_t tb_size, int splitwx, unsigned max_cpus)
     /* init the region struct */
     qemu_mutex_init(&region.lock);
 
-    /* Set guard pages.  No need to do this for the rx_buf, only the rw_buf. */
-    for (i = 0; i < region.n; i++) {
+    /*
+     * Set guard pages.  No need to do this for the rx_buf, only the rw_buf.
+     * Work with the page protections set up with the initial mapping.
+     */
+    need_prot = PAGE_READ | PAGE_WRITE;
+#ifndef CONFIG_TCG_INTERPRETER
+    if (tcg_splitwx_diff == 0) {
+        need_prot |= PAGE_EXEC;
+    }
+#endif
+    for (size_t i = 0, n = region.n; i < n; i++) {
         void *start, *end;
         int rc;
 
         tcg_region_bounds(i, &start, &end);
-        rc = qemu_mprotect_none(end, page_size);
-        g_assert(!rc);
+        if (have_prot != need_prot) {
+            if (need_prot == (PAGE_READ | PAGE_WRITE | PAGE_EXEC)) {
+                rc = qemu_mprotect_rwx(start, end - start);
+            } else if (need_prot == (PAGE_READ | PAGE_WRITE)) {
+                rc = qemu_mprotect_rw(start, end - start);
+            } else {
+                g_assert_not_reached();
+            }
+            if (rc) {
+                error_setg_errno(&error_fatal, errno,
+                                 "mprotect of jit buffer");
+            }
+        }
+        if (have_prot != 0) {
+            /* If guard-page permissions don't change, it isn't fatal. */
+            (void)qemu_mprotect_none(end, page_size);
+        }
     }
 
     tcg_region_trees_init();
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 28/29] tcg: When allocating for !splitwx, begin with PROT_NONE
  2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
                   ` (26 preceding siblings ...)
  2021-03-14 21:27 ` [PATCH v2 27/29] tcg: Merge buffer protection and guard page protection Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
  2021-03-14 21:27 ` [PATCH v2 29/29] tcg: Move tcg_init_ctx and tcg_ctx from accel/tcg/ Richard Henderson
                   ` (2 subsequent siblings)
  30 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: r.bolshakov, j

There's a change in mprotect() behaviour [1] in the latest macOS
on M1 and it's not yet clear if it's going to be fixed by Apple.

In this case, instead of changing permissions of N guard pages,
we change permissions of N rwx regions.  The same number of
syscalls are required either way.

[1] https://gist.github.com/hikalium/75ae822466ee4da13cbbe486498a191f

Buglink: https://bugs.launchpad.net/qemu/+bug/1914849
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/region.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/tcg/region.c b/tcg/region.c
index fac416ebf5..53f78965c7 100644
--- a/tcg/region.c
+++ b/tcg/region.c
@@ -765,12 +765,15 @@ static int alloc_code_gen_buffer(size_t size, int splitwx, Error **errp)
         error_free_or_abort(errp);
     }
 
-    prot = PROT_READ | PROT_WRITE | PROT_EXEC;
+    /*
+     * macOS 11.2 has a bug (Apple Feedback FB8994773) in which mprotect
+     * rejects a permission change from RWX -> NONE when reserving the
+     * guard pages later.  We can go the other way with the same number
+     * of syscalls, so always begin with PROT_NONE.
+     */
+    prot = PROT_NONE;
     flags = MAP_PRIVATE | MAP_ANONYMOUS;
-#ifdef CONFIG_TCG_INTERPRETER
-    /* The tcg interpreter does not need execute permission. */
-    prot = PROT_READ | PROT_WRITE;
-#elif defined(CONFIG_DARWIN)
+#ifdef CONFIG_DARWIN
     /* Applicable to both iOS and macOS (Apple Silicon). */
     if (!splitwx) {
         flags |= MAP_JIT;
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v2 29/29] tcg: Move tcg_init_ctx and tcg_ctx from accel/tcg/
  2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
                   ` (27 preceding siblings ...)
  2021-03-14 21:27 ` [PATCH v2 28/29] tcg: When allocating for !splitwx, begin with PROT_NONE Richard Henderson
@ 2021-03-14 21:27 ` Richard Henderson
  2021-03-14 22:00   ` Philippe Mathieu-Daudé
  2021-03-14 22:12 ` [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug no-reply
  2021-03-15 23:08 ` Roman Bolshakov
  30 siblings, 1 reply; 38+ messages in thread
From: Richard Henderson @ 2021-03-14 21:27 UTC (permalink / raw)
  To: qemu-devel; +Cc: r.bolshakov, j, Philippe Mathieu-Daudé

These variables belong to the jit side, not the user side.

Since tcg_init_ctx is no longer used outside of tcg/, move
the declaration to tcg/internal.h.

Suggested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/tcg/tcg.h         | 1 -
 tcg/internal.h            | 1 +
 accel/tcg/translate-all.c | 3 ---
 tcg/tcg.c                 | 3 +++
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index a19deb529f..eef8857cca 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -690,7 +690,6 @@ static inline bool temp_readonly(TCGTemp *ts)
     return ts->kind >= TEMP_FIXED;
 }
 
-extern TCGContext tcg_init_ctx;
 extern __thread TCGContext *tcg_ctx;
 extern const void *tcg_code_gen_epilogue;
 extern uintptr_t tcg_splitwx_diff;
diff --git a/tcg/internal.h b/tcg/internal.h
index f9906523da..181f86507a 100644
--- a/tcg/internal.h
+++ b/tcg/internal.h
@@ -27,6 +27,7 @@
 
 #define TCG_HIGHWATER 1024
 
+extern TCGContext tcg_init_ctx;
 extern TCGContext **tcg_ctxs;
 extern unsigned int tcg_cur_ctxs;
 extern unsigned int tcg_max_ctxs;
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 40aeecf611..b32760c253 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -218,9 +218,6 @@ static int v_l2_levels;
 
 static void *l1_map[V_L1_MAX_SIZE];
 
-/* code generation context */
-TCGContext tcg_init_ctx;
-__thread TCGContext *tcg_ctx;
 TBContext tb_ctx;
 
 static void page_table_config_init(void)
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 65f9cf01d5..77335fb60f 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -149,6 +149,9 @@ static int tcg_target_const_match(tcg_target_long val, TCGType type,
 static int tcg_out_ldst_finalize(TCGContext *s);
 #endif
 
+TCGContext tcg_init_ctx;
+__thread TCGContext *tcg_ctx;
+
 TCGContext **tcg_ctxs;
 unsigned int tcg_cur_ctxs;
 unsigned int tcg_max_ctxs;
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 29/29] tcg: Move tcg_init_ctx and tcg_ctx from accel/tcg/
  2021-03-14 21:27 ` [PATCH v2 29/29] tcg: Move tcg_init_ctx and tcg_ctx from accel/tcg/ Richard Henderson
@ 2021-03-14 22:00   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 38+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-03-14 22:00 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: r.bolshakov, j

On 3/14/21 10:27 PM, Richard Henderson wrote:
> These variables belong to the jit side, not the user side.
> 
> Since tcg_init_ctx is no longer used outside of tcg/, move
> the declaration to tcg/internal.h.
> 
> Suggested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  include/tcg/tcg.h         | 1 -
>  tcg/internal.h            | 1 +
>  accel/tcg/translate-all.c | 3 ---
>  tcg/tcg.c                 | 3 +++
>  4 files changed, 4 insertions(+), 4 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 22/29] tcg: Return the map protection from alloc_code_gen_buffer
  2021-03-14 21:27 ` [PATCH v2 22/29] tcg: Return the map protection from alloc_code_gen_buffer Richard Henderson
@ 2021-03-14 22:04   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 38+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-03-14 22:04 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: r.bolshakov, j

On 3/14/21 10:27 PM, Richard Henderson wrote:
> Change the interface from a boolean error indication to a
> negative error vs a non-negative protection.  For the moment
> this is only interface change, not making use of the new data.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  tcg/region.c | 63 +++++++++++++++++++++++++++-------------------------
>  1 file changed, 33 insertions(+), 30 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug
  2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
                   ` (28 preceding siblings ...)
  2021-03-14 21:27 ` [PATCH v2 29/29] tcg: Move tcg_init_ctx and tcg_ctx from accel/tcg/ Richard Henderson
@ 2021-03-14 22:12 ` no-reply
  2021-03-15 23:08 ` Roman Bolshakov
  30 siblings, 0 replies; 38+ messages in thread
From: no-reply @ 2021-03-14 22:12 UTC (permalink / raw)
  To: richard.henderson; +Cc: r.bolshakov, qemu-devel, j

Patchew URL: https://patchew.org/QEMU/20210314212724.1917075-1-richard.henderson@linaro.org/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20210314212724.1917075-1-richard.henderson@linaro.org
Subject: [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 - [tag update]      patchew/20210314163927.1184-1-peter.maydell@linaro.org -> patchew/20210314163927.1184-1-peter.maydell@linaro.org
 * [new tag]         patchew/20210314212724.1917075-1-richard.henderson@linaro.org -> patchew/20210314212724.1917075-1-richard.henderson@linaro.org
Switched to a new branch 'test'
9906c07 tcg: Move tcg_init_ctx and tcg_ctx from accel/tcg/
ef1e2c0 tcg: When allocating for !splitwx, begin with PROT_NONE
76e12ad tcg: Merge buffer protection and guard page protection
1e9c899 tcg: Round the tb_size default from qemu_get_host_physmem
4bdfded util/osdep: Add qemu_mprotect_rw
a1751c5 tcg: Do not set guard pages in the rx buffer
40483ad tcg: Sink qemu_madvise call to common code
856c724 tcg: Return the map protection from alloc_code_gen_buffer
7622097 tcg: Allocate code_gen_buffer into struct tcg_region_state
251d71e tcg: Move in_code_gen_buffer and tests to region.c
a6a064d tcg: Tidy split_cross_256mb
af03a0d tcg: Tidy tcg_n_regions
218436d tcg: Rename region.start to region.after_prologue
9f3981e tcg: Replace region.end with region.total_size
276ecb9 tcg: Move MAX_CODE_GEN_BUFFER_SIZE to tcg-target.h
683f5af tcg: Introduce tcg_max_ctxs
d7bf2f6 accel/tcg: Pass down max_cpus to tcg_init
a1cd412 accel/tcg: Merge tcg_exec_init into tcg_init_machine
4940162 tcg: Create tcg_init
4ab59ad accel/tcg: Rename tcg_init to tcg_init_machine
e27bd38 accel/tcg: Move alloc_code_gen_buffer to tcg/region.c
d4c3608 accel/tcg: Inline cpu_gen_init
2245d5c tcg: Split out region.c
a284234 tcg: Split out tcg_region_prologue_set
d116828 tcg: Split out tcg_region_initial_alloc
c75ce79 tcg: Remove error return from tcg_region_initial_alloc__locked
0df4d6c tcg: Re-order tcg_region_init vs tcg_prologue_init
cc0f7f7 meson: Split out fpu/meson.build
b0a2113 meson: Split out tcg/meson.build

=== OUTPUT BEGIN ===
1/29 Checking commit b0a211318ba3 (meson: Split out tcg/meson.build)
Use of uninitialized value $acpi_testexpected in string eq at ./scripts/checkpatch.pl line 1529.
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#44: 
new file mode 100644

total: 0 errors, 1 warnings, 35 lines checked

Patch 1/29 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
2/29 Checking commit cc0f7f7fdc1a (meson: Split out fpu/meson.build)
Use of uninitialized value $acpi_testexpected in string eq at ./scripts/checkpatch.pl line 1529.
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#16: 
new file mode 100644

total: 0 errors, 1 warnings, 17 lines checked

Patch 2/29 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
3/29 Checking commit 0df4d6c7ee67 (tcg: Re-order tcg_region_init vs tcg_prologue_init)
4/29 Checking commit c75ce79d8926 (tcg: Remove error return from tcg_region_initial_alloc__locked)
5/29 Checking commit d116828491cc (tcg: Split out tcg_region_initial_alloc)
6/29 Checking commit a284234d3909 (tcg: Split out tcg_region_prologue_set)
7/29 Checking commit 2245d5c83ec4 (tcg: Split out region.c)
Use of uninitialized value $acpi_testexpected in string eq at ./scripts/checkpatch.pl line 1529.
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#17: 
new file mode 100644

total: 0 errors, 1 warnings, 1189 lines checked

Patch 7/29 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
8/29 Checking commit d4c36080e021 (accel/tcg: Inline cpu_gen_init)
9/29 Checking commit e27bd38f652c (accel/tcg: Move alloc_code_gen_buffer to tcg/region.c)
WARNING: Block comments use a leading /* on a separate line
#499: FILE: tcg/region.c:411:
+/* Minimum size of the code gen buffer.  This number is randomly chosen,

WARNING: Block comments use * on subsequent lines
#500: FILE: tcg/region.c:412:
+/* Minimum size of the code gen buffer.  This number is randomly chosen,
+   but not so small that we can't have a fair number of TB's live.  */

WARNING: Block comments use a trailing */ on a separate line
#500: FILE: tcg/region.c:412:
+   but not so small that we can't have a fair number of TB's live.  */

WARNING: Block comments use a leading /* on a separate line
#503: FILE: tcg/region.c:415:
+/* Maximum size of the code gen buffer we'd like to use.  Unless otherwise

WARNING: Block comments use * on subsequent lines
#504: FILE: tcg/region.c:416:
+/* Maximum size of the code gen buffer we'd like to use.  Unless otherwise
+   indicated, this is constrained by the range of direct branches on the

WARNING: Block comments use a trailing */ on a separate line
#505: FILE: tcg/region.c:417:
+   host cpu, as used by the TCG implementation of goto_tb.  */

WARNING: architecture specific defines should be avoided
#506: FILE: tcg/region.c:418:
+#if defined(__x86_64__)

WARNING: Block comments use a leading /* on a separate line
#520: FILE: tcg/region.c:432:
+  /* We have a 256MB branch region, but leave room to make sure the

WARNING: Block comments use * on subsequent lines
#521: FILE: tcg/region.c:433:
+  /* We have a 256MB branch region, but leave room to make sure the
+     main executable is also within that region.  */

WARNING: Block comments use a trailing */ on a separate line
#521: FILE: tcg/region.c:433:
+     main executable is also within that region.  */

WARNING: architecture specific defines should be avoided
#579: FILE: tcg/region.c:491:
+#ifdef __mips__

WARNING: Block comments use a leading /* on a separate line
#580: FILE: tcg/region.c:492:
+/* In order to use J and JAL within the code_gen_buffer, we require

WARNING: Block comments use * on subsequent lines
#581: FILE: tcg/region.c:493:
+/* In order to use J and JAL within the code_gen_buffer, we require
+   that the buffer not cross a 256MB boundary.  */

WARNING: Block comments use a trailing */ on a separate line
#581: FILE: tcg/region.c:493:
+   that the buffer not cross a 256MB boundary.  */

WARNING: Block comments use a leading /* on a separate line
#587: FILE: tcg/region.c:499:
+/* We weren't able to allocate a buffer without crossing that boundary,

WARNING: Block comments use * on subsequent lines
#588: FILE: tcg/region.c:500:
+/* We weren't able to allocate a buffer without crossing that boundary,
+   so make do with the larger portion of the buffer that doesn't cross.

WARNING: Block comments use a trailing */ on a separate line
#589: FILE: tcg/region.c:501:
+   Returns the new base of the buffer, and adjusts code_gen_buffer_size.  */

WARNING: architecture specific defines should be avoided
#634: FILE: tcg/region.c:546:
+#ifdef __mips__

WARNING: architecture specific defines should be avoided
#686: FILE: tcg/region.c:598:
+#ifdef __mips__

WARNING: architecture specific defines should be avoided
#736: FILE: tcg/region.c:648:
+#ifdef __mips__

WARNING: architecture specific defines should be avoided
#753: FILE: tcg/region.c:665:
+#ifdef __mips__

ERROR: externs should be avoided in .c files
#795: FILE: tcg/region.c:707:
+extern kern_return_t mach_vm_remap(vm_map_t target_task,

total: 1 errors, 21 warnings, 895 lines checked

Patch 9/29 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

10/29 Checking commit 4ab59ad7fbac (accel/tcg: Rename tcg_init to tcg_init_machine)
11/29 Checking commit 49401622f6bd (tcg: Create tcg_init)
12/29 Checking commit a1cd412ff253 (accel/tcg: Merge tcg_exec_init into tcg_init_machine)
WARNING: Block comments use a leading /* on a separate line
#56: FILE: accel/tcg/tcg-all.c:121:
+    /* There's no guest base to take into account, so go ahead and

WARNING: Block comments use * on subsequent lines
#57: FILE: accel/tcg/tcg-all.c:122:
+    /* There's no guest base to take into account, so go ahead and
+       initialize the prologue now.  */

WARNING: Block comments use a trailing */ on a separate line
#57: FILE: accel/tcg/tcg-all.c:122:
+       initialize the prologue now.  */

total: 0 errors, 3 warnings, 81 lines checked

Patch 12/29 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
13/29 Checking commit d7bf2f6c5b84 (accel/tcg: Pass down max_cpus to tcg_init)
14/29 Checking commit 683f5af79dd7 (tcg: Introduce tcg_max_ctxs)
15/29 Checking commit 276ecb9d18cb (tcg: Move MAX_CODE_GEN_BUFFER_SIZE to tcg-target.h)
16/29 Checking commit 9f3981e89b60 (tcg: Replace region.end with region.total_size)
17/29 Checking commit 218436d137c1 (tcg: Rename region.start to region.after_prologue)
18/29 Checking commit af03a0d81294 (tcg: Tidy tcg_n_regions)
19/29 Checking commit a6a064d88d21 (tcg: Tidy split_cross_256mb)
20/29 Checking commit 251d71e63d7f (tcg: Move in_code_gen_buffer and tests to region.c)
21/29 Checking commit 76220971b8cc (tcg: Allocate code_gen_buffer into struct tcg_region_state)
22/29 Checking commit 856c72493829 (tcg: Return the map protection from alloc_code_gen_buffer)
23/29 Checking commit 40483adb7b2e (tcg: Sink qemu_madvise call to common code)
24/29 Checking commit a1751c559ba8 (tcg: Do not set guard pages in the rx buffer)
25/29 Checking commit 4bdfded6d21a (util/osdep: Add qemu_mprotect_rw)
26/29 Checking commit 1e9c89999f44 (tcg: Round the tb_size default from qemu_get_host_physmem)
27/29 Checking commit 76e12ad880b1 (tcg: Merge buffer protection and guard page protection)
28/29 Checking commit ef1e2c0e7aed (tcg: When allocating for !splitwx, begin with PROT_NONE)
29/29 Checking commit 9906c07d1a1e (tcg: Move tcg_init_ctx and tcg_ctx from accel/tcg/)
=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/20210314212724.1917075-1-richard.henderson@linaro.org/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug
  2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
                   ` (29 preceding siblings ...)
  2021-03-14 22:12 ` [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug no-reply
@ 2021-03-15 23:08 ` Roman Bolshakov
  30 siblings, 0 replies; 38+ messages in thread
From: Roman Bolshakov @ 2021-03-15 23:08 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, j

On Sun, Mar 14, 2021 at 03:26:55PM -0600, Richard Henderson wrote:
> Changes for v2:
>   * Move tcg_init_ctx someplace more private (patch 29)
>   * Round result of tb_size based on qemu_get_host_physmem (patch 26)
> 
> Blurb for v1:
>   It took a few more patches than imagined to unify the two
>   places in which we manipulate the tcg code_gen buffer, but
>   the result is surely cleaner.
> 
>   There's a lot more that could be done to clean up this part
>   of tcg too.  I tried to not get too side-tracked, but didn't
>   wholly succeed.
> 
> 

Hi Richard,

Thanks for doing the changes!
I'm not sure if I'll find enough time for thorough review but the series
helps qemu on Big Sur 11.2.3, so:

Tested-by: Roman Bolshakov <r.bolshakov@yadro.com>

Regards,
Roman

> r~
> 
> 
> Richard Henderson (29):
>   meson: Split out tcg/meson.build
>   meson: Split out fpu/meson.build
>   tcg: Re-order tcg_region_init vs tcg_prologue_init
>   tcg: Remove error return from tcg_region_initial_alloc__locked
>   tcg: Split out tcg_region_initial_alloc
>   tcg: Split out tcg_region_prologue_set
>   tcg: Split out region.c
>   accel/tcg: Inline cpu_gen_init
>   accel/tcg: Move alloc_code_gen_buffer to tcg/region.c
>   accel/tcg: Rename tcg_init to tcg_init_machine
>   tcg: Create tcg_init
>   accel/tcg: Merge tcg_exec_init into tcg_init_machine
>   accel/tcg: Pass down max_cpus to tcg_init
>   tcg: Introduce tcg_max_ctxs
>   tcg: Move MAX_CODE_GEN_BUFFER_SIZE to tcg-target.h
>   tcg: Replace region.end with region.total_size
>   tcg: Rename region.start to region.after_prologue
>   tcg: Tidy tcg_n_regions
>   tcg: Tidy split_cross_256mb
>   tcg: Move in_code_gen_buffer and tests to region.c
>   tcg: Allocate code_gen_buffer into struct tcg_region_state
>   tcg: Return the map protection from alloc_code_gen_buffer
>   tcg: Sink qemu_madvise call to common code
>   tcg: Do not set guard pages in the rx buffer
>   util/osdep: Add qemu_mprotect_rw
>   tcg: Round the tb_size default from qemu_get_host_physmem
>   tcg: Merge buffer protection and guard page protection
>   tcg: When allocating for !splitwx, begin with PROT_NONE
>   tcg: Move tcg_init_ctx and tcg_ctx from accel/tcg/
> 
>  meson.build               |  13 +-
>  accel/tcg/internal.h      |   2 +
>  include/qemu/osdep.h      |   1 +
>  include/sysemu/tcg.h      |   2 -
>  include/tcg/tcg.h         |  15 +-
>  tcg/aarch64/tcg-target.h  |   1 +
>  tcg/arm/tcg-target.h      |   1 +
>  tcg/i386/tcg-target.h     |   2 +
>  tcg/internal.h            |  40 ++
>  tcg/mips/tcg-target.h     |   6 +
>  tcg/ppc/tcg-target.h      |   2 +
>  tcg/riscv/tcg-target.h    |   1 +
>  tcg/s390/tcg-target.h     |   3 +
>  tcg/sparc/tcg-target.h    |   1 +
>  tcg/tci/tcg-target.h      |   1 +
>  accel/tcg/tcg-all.c       |  33 +-
>  accel/tcg/translate-all.c | 439 +----------------
>  bsd-user/main.c           |   1 -
>  linux-user/main.c         |   1 -
>  tcg/region.c              | 991 ++++++++++++++++++++++++++++++++++++++
>  tcg/tcg.c                 | 634 ++----------------------
>  util/osdep.c              |   9 +
>  fpu/meson.build           |   1 +
>  tcg/meson.build           |  14 +
>  24 files changed, 1139 insertions(+), 1075 deletions(-)
>  create mode 100644 tcg/internal.h
>  create mode 100644 tcg/region.c
>  create mode 100644 fpu/meson.build
>  create mode 100644 tcg/meson.build
> 
> -- 
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 01/29] meson: Split out tcg/meson.build
  2021-03-14 21:26 ` [PATCH v2 01/29] meson: Split out tcg/meson.build Richard Henderson
@ 2021-03-15 23:09   ` Roman Bolshakov
  0 siblings, 0 replies; 38+ messages in thread
From: Roman Bolshakov @ 2021-03-15 23:09 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Philippe Mathieu-Daudé, qemu-devel, j

On Sun, Mar 14, 2021 at 03:26:56PM -0600, Richard Henderson wrote:
> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---

Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com>

Thanks,
Roman

>  meson.build     |  9 ++-------
>  tcg/meson.build | 13 +++++++++++++
>  2 files changed, 15 insertions(+), 7 deletions(-)
>  create mode 100644 tcg/meson.build
> 
> diff --git a/meson.build b/meson.build
> index a7d2dd429d..742f45c8d8 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -1936,14 +1936,8 @@ specific_ss.add(files('cpu.c', 'disas.c', 'gdbstub.c'), capstone)
>  specific_ss.add(files('exec-vary.c'))
>  specific_ss.add(when: 'CONFIG_TCG', if_true: files(
>    'fpu/softfloat.c',
> -  'tcg/optimize.c',
> -  'tcg/tcg-common.c',
> -  'tcg/tcg-op-gvec.c',
> -  'tcg/tcg-op-vec.c',
> -  'tcg/tcg-op.c',
> -  'tcg/tcg.c',
>  ))
> -specific_ss.add(when: 'CONFIG_TCG_INTERPRETER', if_true: files('disas/tci.c', 'tcg/tci.c'))
> +specific_ss.add(when: 'CONFIG_TCG_INTERPRETER', if_true: files('disas/tci.c'))
>  
>  subdir('backends')
>  subdir('disas')
> @@ -1953,6 +1947,7 @@ subdir('net')
>  subdir('replay')
>  subdir('semihosting')
>  subdir('hw')
> +subdir('tcg')
>  subdir('accel')
>  subdir('plugins')
>  subdir('bsd-user')
> diff --git a/tcg/meson.build b/tcg/meson.build
> new file mode 100644
> index 0000000000..84064a341e
> --- /dev/null
> +++ b/tcg/meson.build
> @@ -0,0 +1,13 @@
> +tcg_ss = ss.source_set()
> +
> +tcg_ss.add(files(
> +  'optimize.c',
> +  'tcg.c',
> +  'tcg-common.c',
> +  'tcg-op.c',
> +  'tcg-op-gvec.c',
> +  'tcg-op-vec.c',
> +))
> +tcg_ss.add(when: 'CONFIG_TCG_INTERPRETER', if_true: files('tci.c'))
> +
> +specific_ss.add_all(when: 'CONFIG_TCG', if_true: tcg_ss)
> -- 
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 02/29] meson: Split out fpu/meson.build
  2021-03-14 21:26 ` [PATCH v2 02/29] meson: Split out fpu/meson.build Richard Henderson
@ 2021-03-15 23:10   ` Roman Bolshakov
  0 siblings, 0 replies; 38+ messages in thread
From: Roman Bolshakov @ 2021-03-15 23:10 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Philippe Mathieu-Daudé, qemu-devel, j

On Sun, Mar 14, 2021 at 03:26:57PM -0600, Richard Henderson wrote:
> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---

Reviewed-by: Roman Bolshakov <r.boshakov@yadro.com>

Thanks,
Roman

>  meson.build     | 4 +---
>  fpu/meson.build | 1 +
>  2 files changed, 2 insertions(+), 3 deletions(-)
>  create mode 100644 fpu/meson.build
> 
> diff --git a/meson.build b/meson.build
> index 742f45c8d8..bfa24b836e 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -1934,9 +1934,6 @@ subdir('softmmu')
>  common_ss.add(capstone)
>  specific_ss.add(files('cpu.c', 'disas.c', 'gdbstub.c'), capstone)
>  specific_ss.add(files('exec-vary.c'))
> -specific_ss.add(when: 'CONFIG_TCG', if_true: files(
> -  'fpu/softfloat.c',
> -))
>  specific_ss.add(when: 'CONFIG_TCG_INTERPRETER', if_true: files('disas/tci.c'))
>  
>  subdir('backends')
> @@ -1948,6 +1945,7 @@ subdir('replay')
>  subdir('semihosting')
>  subdir('hw')
>  subdir('tcg')
> +subdir('fpu')
>  subdir('accel')
>  subdir('plugins')
>  subdir('bsd-user')
> diff --git a/fpu/meson.build b/fpu/meson.build
> new file mode 100644
> index 0000000000..1a9992ded5
> --- /dev/null
> +++ b/fpu/meson.build
> @@ -0,0 +1 @@
> +specific_ss.add(when: 'CONFIG_TCG', if_true: files('softfloat.c'))
> -- 
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 03/29] tcg: Re-order tcg_region_init vs tcg_prologue_init
  2021-03-14 21:26 ` [PATCH v2 03/29] tcg: Re-order tcg_region_init vs tcg_prologue_init Richard Henderson
@ 2021-03-15 23:37   ` Roman Bolshakov
  2021-03-16 14:57     ` Richard Henderson
  0 siblings, 1 reply; 38+ messages in thread
From: Roman Bolshakov @ 2021-03-15 23:37 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel, j

On Sun, Mar 14, 2021 at 03:26:58PM -0600, Richard Henderson wrote:
> Instead of delaying tcg_region_init until after tcg_prologue_init
> is complete, do tcg_region_init first and let tcg_prologue_init
> shrink the first region by the size of the generated prologue.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  accel/tcg/tcg-all.c       | 11 ---------
>  accel/tcg/translate-all.c |  3 +++
>  bsd-user/main.c           |  1 -
>  linux-user/main.c         |  1 -
>  tcg/tcg.c                 | 52 ++++++++++++++-------------------------
>  5 files changed, 22 insertions(+), 46 deletions(-)
> 
> diff --git a/accel/tcg/tcg-all.c b/accel/tcg/tcg-all.c
> index e378c2db73..f132033999 100644
> --- a/accel/tcg/tcg-all.c
> +++ b/accel/tcg/tcg-all.c
> @@ -111,17 +111,6 @@ static int tcg_init(MachineState *ms)
>  
>      tcg_exec_init(s->tb_size * 1024 * 1024, s->splitwx_enabled);
>      mttcg_enabled = s->mttcg_enabled;
> -
> -    /*
> -     * Initialize TCG regions only for softmmu.
> -     *
> -     * This needs to be done later for user mode, because the prologue
> -     * generation needs to be delayed so that GUEST_BASE is already set.
> -     */
> -#ifndef CONFIG_USER_ONLY
> -    tcg_region_init();

Note that tcg_region_init() invokes tcg_n_regions() that depends on
qemu_tcg_mttcg_enabled() that evaluates mttcg_enabled. Likely you need
to move "mttcg_enabled = s->mttcg_enabled;" before tcg_exec_init() to
keep existing behaviour.

> -#endif /* !CONFIG_USER_ONLY */
> -
>      return 0;
>  }
>  
> diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
> index f32df8b240..b9057567f4 100644
> --- a/accel/tcg/translate-all.c
> +++ b/accel/tcg/translate-all.c
> @@ -1339,6 +1339,9 @@ void tcg_exec_init(unsigned long tb_size, int splitwx)
>                                 splitwx, &error_fatal);
>      assert(ok);
>  
> +    /* TODO: allocating regions is hand-in-glove with code_gen_buffer. */
> +    tcg_region_init();
> +
>  #if defined(CONFIG_SOFTMMU)
>      /* There's no guest base to take into account, so go ahead and
>         initialize the prologue now.  */
> diff --git a/bsd-user/main.c b/bsd-user/main.c
> index 798aba512c..3669d2b89e 100644
> --- a/bsd-user/main.c
> +++ b/bsd-user/main.c
> @@ -994,7 +994,6 @@ int main(int argc, char **argv)
>         generating the prologue until now so that the prologue can take
>         the real value of GUEST_BASE into account.  */
>      tcg_prologue_init(tcg_ctx);
> -    tcg_region_init();
>  
>      /* build Task State */
>      memset(ts, 0, sizeof(TaskState));
> diff --git a/linux-user/main.c b/linux-user/main.c
> index 4f4746dce8..1bc48ca954 100644
> --- a/linux-user/main.c
> +++ b/linux-user/main.c
> @@ -850,7 +850,6 @@ int main(int argc, char **argv, char **envp)
>         generating the prologue until now so that the prologue can take
>         the real value of GUEST_BASE into account.  */
>      tcg_prologue_init(tcg_ctx);
> -    tcg_region_init();
>  
>      target_cpu_copy_regs(env, regs);
>  
> diff --git a/tcg/tcg.c b/tcg/tcg.c
> index 2991112829..0a2e5710de 100644
> --- a/tcg/tcg.c
> +++ b/tcg/tcg.c
> @@ -1204,32 +1204,18 @@ TranslationBlock *tcg_tb_alloc(TCGContext *s)
>  
>  void tcg_prologue_init(TCGContext *s)
>  {
> -    size_t prologue_size, total_size;
> -    void *buf0, *buf1;
> +    size_t prologue_size;
>  
>      /* Put the prologue at the beginning of code_gen_buffer.  */
> -    buf0 = s->code_gen_buffer;
> -    total_size = s->code_gen_buffer_size;
> -    s->code_ptr = buf0;
> -    s->code_buf = buf0;
> +    tcg_region_assign(s, 0);
> +    s->code_ptr = s->code_gen_ptr;
> +    s->code_buf = s->code_gen_ptr;

Pardon me for asking a naive question, what's the difference between
s->code_buf and s->code_gen_buf and, respectively, s->code_ptr and
s->code_gen_ptr?

Thanks,
Roman

>      s->data_gen_ptr = NULL;
>  
> -    /*
> -     * The region trees are not yet configured, but tcg_splitwx_to_rx
> -     * needs the bounds for an assert.
> -     */
> -    region.start = buf0;
> -    region.end = buf0 + total_size;
> -
>  #ifndef CONFIG_TCG_INTERPRETER
> -    tcg_qemu_tb_exec = (tcg_prologue_fn *)tcg_splitwx_to_rx(buf0);
> +    tcg_qemu_tb_exec = (tcg_prologue_fn *)tcg_splitwx_to_rx(s->code_ptr);
>  #endif
>  
> -    /* Compute a high-water mark, at which we voluntarily flush the buffer
> -       and start over.  The size here is arbitrary, significantly larger
> -       than we expect the code generation for any one opcode to require.  */
> -    s->code_gen_highwater = s->code_gen_buffer + (total_size - TCG_HIGHWATER);
> -
>  #ifdef TCG_TARGET_NEED_POOL_LABELS
>      s->pool_labels = NULL;
>  #endif
> @@ -1246,32 +1232,32 @@ void tcg_prologue_init(TCGContext *s)
>      }
>  #endif
>  
> -    buf1 = s->code_ptr;
> +    prologue_size = tcg_current_code_size(s);
> +
>  #ifndef CONFIG_TCG_INTERPRETER
> -    flush_idcache_range((uintptr_t)tcg_splitwx_to_rx(buf0), (uintptr_t)buf0,
> -                        tcg_ptr_byte_diff(buf1, buf0));
> +    flush_idcache_range((uintptr_t)tcg_splitwx_to_rx(s->code_buf),
> +                        (uintptr_t)s->code_buf, prologue_size);
>  #endif
>  
> -    /* Deduct the prologue from the buffer.  */
> -    prologue_size = tcg_current_code_size(s);
> -    s->code_gen_ptr = buf1;
> -    s->code_gen_buffer = buf1;
> -    s->code_buf = buf1;
> -    total_size -= prologue_size;
> -    s->code_gen_buffer_size = total_size;
> +    /* Deduct the prologue from the first region.  */
> +    region.start = s->code_ptr;
>  
> -    tcg_register_jit(tcg_splitwx_to_rx(s->code_gen_buffer), total_size);
> +    /* Recompute boundaries of the first region. */
> +    tcg_region_assign(s, 0);
> +
> +    tcg_register_jit(tcg_splitwx_to_rx(region.start),
> +                     region.end - region.start);
>  
>  #ifdef DEBUG_DISAS
>      if (qemu_loglevel_mask(CPU_LOG_TB_OUT_ASM)) {
>          FILE *logfile = qemu_log_lock();
>          qemu_log("PROLOGUE: [size=%zu]\n", prologue_size);
>          if (s->data_gen_ptr) {
> -            size_t code_size = s->data_gen_ptr - buf0;
> +            size_t code_size = s->data_gen_ptr - s->code_gen_ptr;
>              size_t data_size = prologue_size - code_size;
>              size_t i;
>  
> -            log_disas(buf0, code_size);
> +            log_disas(s->code_gen_ptr, code_size);
>  
>              for (i = 0; i < data_size; i += sizeof(tcg_target_ulong)) {
>                  if (sizeof(tcg_target_ulong) == 8) {
> @@ -1285,7 +1271,7 @@ void tcg_prologue_init(TCGContext *s)
>                  }
>              }
>          } else {
> -            log_disas(buf0, prologue_size);
> +            log_disas(s->code_gen_ptr, prologue_size);
>          }
>          qemu_log("\n");
>          qemu_log_flush();
> -- 
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2 03/29] tcg: Re-order tcg_region_init vs tcg_prologue_init
  2021-03-15 23:37   ` Roman Bolshakov
@ 2021-03-16 14:57     ` Richard Henderson
  0 siblings, 0 replies; 38+ messages in thread
From: Richard Henderson @ 2021-03-16 14:57 UTC (permalink / raw)
  To: Roman Bolshakov; +Cc: qemu-devel, j

On 3/15/21 5:37 PM, Roman Bolshakov wrote:
>>       tcg_exec_init(s->tb_size * 1024 * 1024, s->splitwx_enabled);
>>       mttcg_enabled = s->mttcg_enabled;
>> -
>> -    /*
>> -     * Initialize TCG regions only for softmmu.
>> -     *
>> -     * This needs to be done later for user mode, because the prologue
>> -     * generation needs to be delayed so that GUEST_BASE is already set.
>> -     */
>> -#ifndef CONFIG_USER_ONLY
>> -    tcg_region_init();
> 
> Note that tcg_region_init() invokes tcg_n_regions() that depends on
> qemu_tcg_mttcg_enabled() that evaluates mttcg_enabled. Likely you need
> to move "mttcg_enabled = s->mttcg_enabled;" before tcg_exec_init() to
> keep existing behaviour.

Yes indeed.  This gets fixed in patch 12, which is why I didn't notice 
breakage.  Will adjust.

>> -    total_size = s->code_gen_buffer_size;
>> -    s->code_ptr = buf0;
>> -    s->code_buf = buf0;
>> +    tcg_region_assign(s, 0);
>> +    s->code_ptr = s->code_gen_ptr;
>> +    s->code_buf = s->code_gen_ptr;
> 
> Pardon me for asking a naive question, what's the difference between
> s->code_buf and s->code_gen_buf and, respectively, s->code_ptr and
> s->code_gen_ptr?

I don't remember.  I actually had it in my mind to rename all of these, remove 
one or two that feel redundant, and document them all.  But the patch set was 
large enough already.


r~


^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2021-03-16 14:58 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-14 21:26 [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug Richard Henderson
2021-03-14 21:26 ` [PATCH v2 01/29] meson: Split out tcg/meson.build Richard Henderson
2021-03-15 23:09   ` Roman Bolshakov
2021-03-14 21:26 ` [PATCH v2 02/29] meson: Split out fpu/meson.build Richard Henderson
2021-03-15 23:10   ` Roman Bolshakov
2021-03-14 21:26 ` [PATCH v2 03/29] tcg: Re-order tcg_region_init vs tcg_prologue_init Richard Henderson
2021-03-15 23:37   ` Roman Bolshakov
2021-03-16 14:57     ` Richard Henderson
2021-03-14 21:26 ` [PATCH v2 04/29] tcg: Remove error return from tcg_region_initial_alloc__locked Richard Henderson
2021-03-14 21:27 ` [PATCH v2 05/29] tcg: Split out tcg_region_initial_alloc Richard Henderson
2021-03-14 21:27 ` [PATCH v2 06/29] tcg: Split out tcg_region_prologue_set Richard Henderson
2021-03-14 21:27 ` [PATCH v2 07/29] tcg: Split out region.c Richard Henderson
2021-03-14 21:27 ` [PATCH v2 08/29] accel/tcg: Inline cpu_gen_init Richard Henderson
2021-03-14 21:27 ` [PATCH v2 09/29] accel/tcg: Move alloc_code_gen_buffer to tcg/region.c Richard Henderson
2021-03-14 21:27 ` [PATCH v2 10/29] accel/tcg: Rename tcg_init to tcg_init_machine Richard Henderson
2021-03-14 21:27 ` [PATCH v2 11/29] tcg: Create tcg_init Richard Henderson
2021-03-14 21:27 ` [PATCH v2 12/29] accel/tcg: Merge tcg_exec_init into tcg_init_machine Richard Henderson
2021-03-14 21:27 ` [PATCH v2 13/29] accel/tcg: Pass down max_cpus to tcg_init Richard Henderson
2021-03-14 21:27 ` [PATCH v2 14/29] tcg: Introduce tcg_max_ctxs Richard Henderson
2021-03-14 21:27 ` [PATCH v2 15/29] tcg: Move MAX_CODE_GEN_BUFFER_SIZE to tcg-target.h Richard Henderson
2021-03-14 21:27 ` [PATCH v2 16/29] tcg: Replace region.end with region.total_size Richard Henderson
2021-03-14 21:27 ` [PATCH v2 17/29] tcg: Rename region.start to region.after_prologue Richard Henderson
2021-03-14 21:27 ` [PATCH v2 18/29] tcg: Tidy tcg_n_regions Richard Henderson
2021-03-14 21:27 ` [PATCH v2 19/29] tcg: Tidy split_cross_256mb Richard Henderson
2021-03-14 21:27 ` [PATCH v2 20/29] tcg: Move in_code_gen_buffer and tests to region.c Richard Henderson
2021-03-14 21:27 ` [PATCH v2 21/29] tcg: Allocate code_gen_buffer into struct tcg_region_state Richard Henderson
2021-03-14 21:27 ` [PATCH v2 22/29] tcg: Return the map protection from alloc_code_gen_buffer Richard Henderson
2021-03-14 22:04   ` Philippe Mathieu-Daudé
2021-03-14 21:27 ` [PATCH v2 23/29] tcg: Sink qemu_madvise call to common code Richard Henderson
2021-03-14 21:27 ` [PATCH v2 24/29] tcg: Do not set guard pages in the rx buffer Richard Henderson
2021-03-14 21:27 ` [PATCH v2 25/29] util/osdep: Add qemu_mprotect_rw Richard Henderson
2021-03-14 21:27 ` [PATCH v2 26/29] tcg: Round the tb_size default from qemu_get_host_physmem Richard Henderson
2021-03-14 21:27 ` [PATCH v2 27/29] tcg: Merge buffer protection and guard page protection Richard Henderson
2021-03-14 21:27 ` [PATCH v2 28/29] tcg: When allocating for !splitwx, begin with PROT_NONE Richard Henderson
2021-03-14 21:27 ` [PATCH v2 29/29] tcg: Move tcg_init_ctx and tcg_ctx from accel/tcg/ Richard Henderson
2021-03-14 22:00   ` Philippe Mathieu-Daudé
2021-03-14 22:12 ` [PATCH v2 00/29] tcg: Workaround macOS 11.2 mprotect bug no-reply
2021-03-15 23:08 ` Roman Bolshakov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).