All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 00/10] tcg: breakpoint reorg
@ 2021-07-12 15:39 Richard Henderson
  2021-07-12 15:39 ` [PATCH v2 01/10] accel/tcg: Reduce CF_COUNT_MASK to match TCG_MAX_INSNS Richard Henderson
                   ` (11 more replies)
  0 siblings, 12 replies; 37+ messages in thread
From: Richard Henderson @ 2021-07-12 15:39 UTC (permalink / raw)
  To: qemu-devel; +Cc: mark.cave-ayland, alex.bennee, f4bug

This is fixing #404 ("windows xp boot takes much longer...")
and several other similar reports.

For v2, all prerequisites and 7 of the patches from v1 with
reviews are now upstream.

Mark Cave-Ayland reported success with WinXP with v1, with
this patch set being even faster than b55f54bc~1.  Which was
a bit of a surprise, but I'll take it.  It means that it's
probably not worth making the breakpoint detection scheme
any more complicated.

I'd still like some more feedback.  Given this is fixing a
regression from qemu 5.2 I feel comfortable delaying this
past soft freeze, but not past hard freeze on the 20th.


r~


Richard Henderson (10):
  accel/tcg: Reduce CF_COUNT_MASK to match TCG_MAX_INSNS
  accel/tcg: Move curr_cflags into cpu-exec.c
  accel/tcg: Add CF_NO_GOTO_TB and CF_NO_GOTO_PTR
  accel/tcg: Drop CF_NO_GOTO_PTR from -d nochain
  accel/tcg: Handle -singlestep in curr_cflags
  accel/tcg: Use CF_NO_GOTO_{TB,PTR} in cpu_exec_step_atomic
  accel/tcg: Move cflags lookup into tb_find
  accel/tcg: Adjust interface of TranslatorOps.breakpoint_check
  accel/tcg: Hoist tb_cflags to a local in translator_loop
  accel/tcg: Encode breakpoint info into tb->cflags

 include/exec/exec-all.h       |  30 +++++---
 include/exec/translator.h     |  17 +++--
 accel/tcg/cpu-exec.c          | 130 ++++++++++++++++++++++++++++------
 accel/tcg/translate-all.c     |   7 +-
 accel/tcg/translator.c        |  79 ++++++++++++++-------
 cpu.c                         |  24 -------
 target/alpha/translate.c      |  12 +---
 target/arm/translate-a64.c    |  14 ++--
 target/arm/translate.c        |  20 +++---
 target/avr/translate.c        |   6 +-
 target/cris/translate.c       |  14 ++--
 target/hexagon/translate.c    |  13 +---
 target/hppa/translate.c       |   7 +-
 target/i386/tcg/translate.c   |  15 ++--
 target/m68k/translate.c       |  14 +---
 target/microblaze/translate.c |  14 +---
 target/mips/tcg/translate.c   |  14 ++--
 target/nios2/translate.c      |  13 +---
 target/openrisc/translate.c   |  11 +--
 target/ppc/translate.c        |  13 +---
 target/riscv/translate.c      |  11 +--
 target/rx/translate.c         |   8 +--
 target/s390x/translate.c      |  12 ++--
 target/sh4/translate.c        |  12 ++--
 target/sparc/translate.c      |   9 ++-
 target/tricore/translate.c    |  13 +---
 target/xtensa/translate.c     |  12 ++--
 tcg/tcg-op.c                  |  28 ++++----
 28 files changed, 280 insertions(+), 292 deletions(-)

-- 
2.25.1



^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH v2 01/10] accel/tcg: Reduce CF_COUNT_MASK to match TCG_MAX_INSNS
  2021-07-12 15:39 [PATCH v2 00/10] tcg: breakpoint reorg Richard Henderson
@ 2021-07-12 15:39 ` Richard Henderson
  2021-07-17 17:30   ` Peter Maydell
  2021-07-17 17:33   ` Alex Bennée
  2021-07-12 15:39 ` [PATCH v2 02/10] accel/tcg: Move curr_cflags into cpu-exec.c Richard Henderson
                   ` (10 subsequent siblings)
  11 siblings, 2 replies; 37+ messages in thread
From: Richard Henderson @ 2021-07-12 15:39 UTC (permalink / raw)
  To: qemu-devel; +Cc: mark.cave-ayland, alex.bennee, f4bug

The space reserved for CF_COUNT_MASK was overly large.
Reduce to free up cflags bits and eliminate an extra test.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/exec/exec-all.h   | 4 +++-
 accel/tcg/translate-all.c | 5 ++---
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 754f4130c9..dfe82ed19c 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -492,7 +492,9 @@ struct TranslationBlock {
     target_ulong cs_base; /* CS base for this block */
     uint32_t flags; /* flags defining in which context the code was generated */
     uint32_t cflags;    /* compile flags */
-#define CF_COUNT_MASK  0x00007fff
+
+/* Note that TCG_MAX_INSNS is 512; we validate this match elsewhere. */
+#define CF_COUNT_MASK  0x000001ff
 #define CF_LAST_IO     0x00008000 /* Last insn may be an IO access.  */
 #define CF_MEMI_ONLY   0x00010000 /* Only instrument memory ops */
 #define CF_USE_ICOUNT  0x00020000
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 4df26de858..997e44c73b 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -1430,9 +1430,8 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
     if (max_insns == 0) {
         max_insns = CF_COUNT_MASK;
     }
-    if (max_insns > TCG_MAX_INSNS) {
-        max_insns = TCG_MAX_INSNS;
-    }
+    QEMU_BUILD_BUG_ON(CF_COUNT_MASK + 1 != TCG_MAX_INSNS);
+
     if (cpu->singlestep_enabled || singlestep) {
         max_insns = 1;
     }
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 02/10] accel/tcg: Move curr_cflags into cpu-exec.c
  2021-07-12 15:39 [PATCH v2 00/10] tcg: breakpoint reorg Richard Henderson
  2021-07-12 15:39 ` [PATCH v2 01/10] accel/tcg: Reduce CF_COUNT_MASK to match TCG_MAX_INSNS Richard Henderson
@ 2021-07-12 15:39 ` Richard Henderson
  2021-07-17 17:32   ` Peter Maydell
  2021-07-17 17:34   ` Alex Bennée
  2021-07-12 15:39 ` [PATCH v2 03/10] accel/tcg: Add CF_NO_GOTO_TB and CF_NO_GOTO_PTR Richard Henderson
                   ` (9 subsequent siblings)
  11 siblings, 2 replies; 37+ messages in thread
From: Richard Henderson @ 2021-07-12 15:39 UTC (permalink / raw)
  To: qemu-devel; +Cc: mark.cave-ayland, alex.bennee, f4bug

We will shortly have more than a simple member read here,
with stuff not necessarily exposed to exec/exec-all.h.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/exec/exec-all.h | 5 +----
 accel/tcg/cpu-exec.c    | 5 +++++
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index dfe82ed19c..ae7603ca75 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -565,10 +565,7 @@ static inline uint32_t tb_cflags(const TranslationBlock *tb)
 }
 
 /* current cflags for hashing/comparison */
-static inline uint32_t curr_cflags(CPUState *cpu)
-{
-    return cpu->tcg_cflags;
-}
+uint32_t curr_cflags(CPUState *cpu);
 
 /* TranslationBlock invalidate API */
 #if defined(CONFIG_USER_ONLY)
diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index e22bcb99f7..ef4214d893 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -145,6 +145,11 @@ static void init_delay_params(SyncClocks *sc, const CPUState *cpu)
 }
 #endif /* CONFIG USER ONLY */
 
+uint32_t curr_cflags(CPUState *cpu)
+{
+    return cpu->tcg_cflags;
+}
+
 /* Might cause an exception, so have a longjmp destination ready */
 static inline TranslationBlock *tb_lookup(CPUState *cpu, target_ulong pc,
                                           target_ulong cs_base,
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 03/10] accel/tcg: Add CF_NO_GOTO_TB and CF_NO_GOTO_PTR
  2021-07-12 15:39 [PATCH v2 00/10] tcg: breakpoint reorg Richard Henderson
  2021-07-12 15:39 ` [PATCH v2 01/10] accel/tcg: Reduce CF_COUNT_MASK to match TCG_MAX_INSNS Richard Henderson
  2021-07-12 15:39 ` [PATCH v2 02/10] accel/tcg: Move curr_cflags into cpu-exec.c Richard Henderson
@ 2021-07-12 15:39 ` Richard Henderson
  2021-07-17 17:36   ` Alex Bennée
  2021-07-17 17:36   ` Peter Maydell
  2021-07-12 15:39 ` [PATCH v2 04/10] accel/tcg: Drop CF_NO_GOTO_PTR from -d nochain Richard Henderson
                   ` (8 subsequent siblings)
  11 siblings, 2 replies; 37+ messages in thread
From: Richard Henderson @ 2021-07-12 15:39 UTC (permalink / raw)
  To: qemu-devel; +Cc: mark.cave-ayland, alex.bennee, f4bug

Move the -d nochain check to bits on tb->cflags.
These will be used for more than -d nochain shortly.

Set bits during curr_cflags, test them in translator_use_goto_tb,
assert we're not doing anything odd in tcg_gen_goto_tb.  The test
in tcg_gen_exit_tb is redundant with the assert for goto_tb_issue_mask.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/exec/exec-all.h | 16 +++++++++-------
 accel/tcg/cpu-exec.c    |  8 +++++++-
 accel/tcg/translator.c  |  5 +++++
 tcg/tcg-op.c            | 28 ++++++++++++----------------
 4 files changed, 33 insertions(+), 24 deletions(-)

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index ae7603ca75..6873cce8df 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -494,13 +494,15 @@ struct TranslationBlock {
     uint32_t cflags;    /* compile flags */
 
 /* Note that TCG_MAX_INSNS is 512; we validate this match elsewhere. */
-#define CF_COUNT_MASK  0x000001ff
-#define CF_LAST_IO     0x00008000 /* Last insn may be an IO access.  */
-#define CF_MEMI_ONLY   0x00010000 /* Only instrument memory ops */
-#define CF_USE_ICOUNT  0x00020000
-#define CF_INVALID     0x00040000 /* TB is stale. Set with @jmp_lock held */
-#define CF_PARALLEL    0x00080000 /* Generate code for a parallel context */
-#define CF_CLUSTER_MASK 0xff000000 /* Top 8 bits are cluster ID */
+#define CF_COUNT_MASK    0x000001ff
+#define CF_NO_GOTO_TB    0x00000200 /* Do not chain with goto_tb */
+#define CF_NO_GOTO_PTR   0x00000400 /* Do not chain with goto_ptr */
+#define CF_LAST_IO       0x00008000 /* Last insn may be an IO access.  */
+#define CF_MEMI_ONLY     0x00010000 /* Only instrument memory ops */
+#define CF_USE_ICOUNT    0x00020000
+#define CF_INVALID       0x00040000 /* TB is stale. Set with @jmp_lock held */
+#define CF_PARALLEL      0x00080000 /* Generate code for a parallel context */
+#define CF_CLUSTER_MASK  0xff000000 /* Top 8 bits are cluster ID */
 #define CF_CLUSTER_SHIFT 24
 
     /* Per-vCPU dynamic tracing state used to generate this TB */
diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index ef4214d893..d3232d5764 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -147,7 +147,13 @@ static void init_delay_params(SyncClocks *sc, const CPUState *cpu)
 
 uint32_t curr_cflags(CPUState *cpu)
 {
-    return cpu->tcg_cflags;
+    uint32_t cflags = cpu->tcg_cflags;
+
+    if (qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
+        cflags |= CF_NO_GOTO_TB | CF_NO_GOTO_PTR;
+    }
+
+    return cflags;
 }
 
 /* Might cause an exception, so have a longjmp destination ready */
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index 59804af37b..2ea5a74f30 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -33,6 +33,11 @@ void translator_loop_temp_check(DisasContextBase *db)
 
 bool translator_use_goto_tb(DisasContextBase *db, target_ulong dest)
 {
+    /* Suppress goto_tb if requested. */
+    if (tb_cflags(db->tb) & CF_NO_GOTO_TB) {
+        return false;
+    }
+
     /* Suppress goto_tb in the case of single-steping.  */
     if (db->singlestep_enabled || singlestep) {
         return false;
diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index 0c561fb253..e0d54d537f 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -2723,10 +2723,6 @@ void tcg_gen_exit_tb(const TranslationBlock *tb, unsigned idx)
            seen this numbered exit before, via tcg_gen_goto_tb.  */
         tcg_debug_assert(tcg_ctx->goto_tb_issue_mask & (1 << idx));
 #endif
-        /* When not chaining, exit without indicating a link.  */
-        if (qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
-            val = 0;
-        }
     } else {
         /* This is an exit via the exitreq label.  */
         tcg_debug_assert(idx == TB_EXIT_REQUESTED);
@@ -2738,6 +2734,8 @@ void tcg_gen_exit_tb(const TranslationBlock *tb, unsigned idx)
 
 void tcg_gen_goto_tb(unsigned idx)
 {
+    /* We tested CF_NO_GOTO_TB in translator_use_goto_tb. */
+    tcg_debug_assert(!(tcg_ctx->tb_cflags & CF_NO_GOTO_TB));
     /* We only support two chained exits.  */
     tcg_debug_assert(idx <= TB_EXIT_IDXMAX);
 #ifdef CONFIG_DEBUG_TCG
@@ -2746,25 +2744,23 @@ void tcg_gen_goto_tb(unsigned idx)
     tcg_ctx->goto_tb_issue_mask |= 1 << idx;
 #endif
     plugin_gen_disable_mem_helpers();
-    /* When not chaining, we simply fall through to the "fallback" exit.  */
-    if (!qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
-        tcg_gen_op1i(INDEX_op_goto_tb, idx);
-    }
+    tcg_gen_op1i(INDEX_op_goto_tb, idx);
 }
 
 void tcg_gen_lookup_and_goto_ptr(void)
 {
-    if (!qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
-        TCGv_ptr ptr;
+    TCGv_ptr ptr;
 
-        plugin_gen_disable_mem_helpers();
-        ptr = tcg_temp_new_ptr();
-        gen_helper_lookup_tb_ptr(ptr, cpu_env);
-        tcg_gen_op1i(INDEX_op_goto_ptr, tcgv_ptr_arg(ptr));
-        tcg_temp_free_ptr(ptr);
-    } else {
+    if (tcg_ctx->tb_cflags & CF_NO_GOTO_PTR) {
         tcg_gen_exit_tb(NULL, 0);
+        return;
     }
+
+    plugin_gen_disable_mem_helpers();
+    ptr = tcg_temp_new_ptr();
+    gen_helper_lookup_tb_ptr(ptr, cpu_env);
+    tcg_gen_op1i(INDEX_op_goto_ptr, tcgv_ptr_arg(ptr));
+    tcg_temp_free_ptr(ptr);
 }
 
 static inline MemOp tcg_canonicalize_memop(MemOp op, bool is64, bool st)
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 04/10] accel/tcg: Drop CF_NO_GOTO_PTR from -d nochain
  2021-07-12 15:39 [PATCH v2 00/10] tcg: breakpoint reorg Richard Henderson
                   ` (2 preceding siblings ...)
  2021-07-12 15:39 ` [PATCH v2 03/10] accel/tcg: Add CF_NO_GOTO_TB and CF_NO_GOTO_PTR Richard Henderson
@ 2021-07-12 15:39 ` Richard Henderson
  2021-07-17 17:39   ` Peter Maydell
  2021-07-17 17:39   ` Alex Bennée
  2021-07-12 15:39 ` [PATCH v2 05/10] accel/tcg: Handle -singlestep in curr_cflags Richard Henderson
                   ` (7 subsequent siblings)
  11 siblings, 2 replies; 37+ messages in thread
From: Richard Henderson @ 2021-07-12 15:39 UTC (permalink / raw)
  To: qemu-devel; +Cc: mark.cave-ayland, alex.bennee, f4bug

The purpose of suppressing goto_ptr from -d nochain had been
to return to the main loop so that -d cpu would be recognized.
But we now include -d cpu logging in helper_lookup_tb_ptr so
there is no need to exclude goto_ptr.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cpu-exec.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index d3232d5764..70ea3c7d68 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -150,7 +150,7 @@ uint32_t curr_cflags(CPUState *cpu)
     uint32_t cflags = cpu->tcg_cflags;
 
     if (qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
-        cflags |= CF_NO_GOTO_TB | CF_NO_GOTO_PTR;
+        cflags |= CF_NO_GOTO_TB;
     }
 
     return cflags;
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 05/10] accel/tcg: Handle -singlestep in curr_cflags
  2021-07-12 15:39 [PATCH v2 00/10] tcg: breakpoint reorg Richard Henderson
                   ` (3 preceding siblings ...)
  2021-07-12 15:39 ` [PATCH v2 04/10] accel/tcg: Drop CF_NO_GOTO_PTR from -d nochain Richard Henderson
@ 2021-07-12 15:39 ` Richard Henderson
  2021-07-17 17:41   ` Peter Maydell
  2021-07-17 17:42   ` Alex Bennée
  2021-07-12 15:40 ` [PATCH v2 06/10] accel/tcg: Use CF_NO_GOTO_{TB, PTR} in cpu_exec_step_atomic Richard Henderson
                   ` (6 subsequent siblings)
  11 siblings, 2 replies; 37+ messages in thread
From: Richard Henderson @ 2021-07-12 15:39 UTC (permalink / raw)
  To: qemu-devel; +Cc: mark.cave-ayland, alex.bennee, f4bug

Exchange the test in translator_use_goto_tb for CF_NO_GOTO_TB,
and the test in tb_gen_code for setting CF_COUNT_MASK to 1.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cpu-exec.c      | 8 +++++++-
 accel/tcg/translate-all.c | 2 +-
 accel/tcg/translator.c    | 2 +-
 3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index 70ea3c7d68..2206c463f5 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -149,7 +149,13 @@ uint32_t curr_cflags(CPUState *cpu)
 {
     uint32_t cflags = cpu->tcg_cflags;
 
-    if (qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
+    /*
+     * For singlestep and -d nochain, suppress goto_tb so that
+     * we can log -d cpu,exec after every TB.
+     */
+    if (singlestep) {
+        cflags |= CF_NO_GOTO_TB | 1;
+    } else if (qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
         cflags |= CF_NO_GOTO_TB;
     }
 
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 997e44c73b..491c1a56b2 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -1432,7 +1432,7 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
     }
     QEMU_BUILD_BUG_ON(CF_COUNT_MASK + 1 != TCG_MAX_INSNS);
 
-    if (cpu->singlestep_enabled || singlestep) {
+    if (cpu->singlestep_enabled) {
         max_insns = 1;
     }
 
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index 2ea5a74f30..a59eb7c11b 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -39,7 +39,7 @@ bool translator_use_goto_tb(DisasContextBase *db, target_ulong dest)
     }
 
     /* Suppress goto_tb in the case of single-steping.  */
-    if (db->singlestep_enabled || singlestep) {
+    if (db->singlestep_enabled) {
         return false;
     }
 
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 06/10] accel/tcg: Use CF_NO_GOTO_{TB, PTR} in cpu_exec_step_atomic
  2021-07-12 15:39 [PATCH v2 00/10] tcg: breakpoint reorg Richard Henderson
                   ` (4 preceding siblings ...)
  2021-07-12 15:39 ` [PATCH v2 05/10] accel/tcg: Handle -singlestep in curr_cflags Richard Henderson
@ 2021-07-12 15:40 ` Richard Henderson
  2021-07-17 17:43   ` Peter Maydell
  2021-07-17 17:47   ` [PATCH v2 06/10] accel/tcg: Use CF_NO_GOTO_{TB,PTR} " Alex Bennée
  2021-07-12 15:40 ` [PATCH v2 07/10] accel/tcg: Move cflags lookup into tb_find Richard Henderson
                   ` (5 subsequent siblings)
  11 siblings, 2 replies; 37+ messages in thread
From: Richard Henderson @ 2021-07-12 15:40 UTC (permalink / raw)
  To: qemu-devel; +Cc: mark.cave-ayland, alex.bennee, f4bug

Request that the one TB returns immediately, so that
we release the exclusive lock as soon as possible.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cpu-exec.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index 2206c463f5..5bb099174f 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -330,8 +330,7 @@ void cpu_exec_step_atomic(CPUState *cpu)
     CPUArchState *env = (CPUArchState *)cpu->env_ptr;
     TranslationBlock *tb;
     target_ulong cs_base, pc;
-    uint32_t flags;
-    uint32_t cflags = (curr_cflags(cpu) & ~CF_PARALLEL) | 1;
+    uint32_t flags, cflags;
     int tb_exit;
 
     if (sigsetjmp(cpu->jmp_env, 0) == 0) {
@@ -341,8 +340,14 @@ void cpu_exec_step_atomic(CPUState *cpu)
         cpu->running = true;
 
         cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
-        tb = tb_lookup(cpu, pc, cs_base, flags, cflags);
 
+        cflags = curr_cflags(cpu);
+        /* Execute in a serial context. */
+        cflags &= ~CF_PARALLEL;
+        /* After 1 insn, return and release the exclusive lock. */
+        cflags |= CF_NO_GOTO_TB | CF_NO_GOTO_PTR | 1;
+
+        tb = tb_lookup(cpu, pc, cs_base, flags, cflags);
         if (tb == NULL) {
             mmap_lock();
             tb = tb_gen_code(cpu, pc, cs_base, flags, cflags);
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 07/10] accel/tcg: Move cflags lookup into tb_find
  2021-07-12 15:39 [PATCH v2 00/10] tcg: breakpoint reorg Richard Henderson
                   ` (5 preceding siblings ...)
  2021-07-12 15:40 ` [PATCH v2 06/10] accel/tcg: Use CF_NO_GOTO_{TB, PTR} in cpu_exec_step_atomic Richard Henderson
@ 2021-07-12 15:40 ` Richard Henderson
  2021-07-17 17:45   ` Peter Maydell
  2021-07-12 15:40 ` [PATCH v2 08/10] accel/tcg: Adjust interface of TranslatorOps.breakpoint_check Richard Henderson
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 37+ messages in thread
From: Richard Henderson @ 2021-07-12 15:40 UTC (permalink / raw)
  To: qemu-devel; +Cc: mark.cave-ayland, alex.bennee, f4bug

We will shortly require the guest pc for computing cflags,
so move the choice just after cpu_get_tb_cpu_state.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cpu-exec.c | 34 +++++++++++++++++-----------------
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index 5bb099174f..4d043a11aa 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -502,15 +502,29 @@ static inline void tb_add_jump(TranslationBlock *tb, int n,
 
 static inline TranslationBlock *tb_find(CPUState *cpu,
                                         TranslationBlock *last_tb,
-                                        int tb_exit, uint32_t cflags)
+                                        int tb_exit)
 {
     CPUArchState *env = (CPUArchState *)cpu->env_ptr;
     TranslationBlock *tb;
     target_ulong cs_base, pc;
-    uint32_t flags;
+    uint32_t flags, cflags;
 
     cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
 
+    /*
+     * When requested, use an exact setting for cflags for the next
+     * execution.  This is used for icount, precise smc, and stop-
+     * after-access watchpoints.  Since this request should never
+     * have CF_INVALID set, -1 is a convenient invalid value that
+     * does not require tcg headers for cpu_common_reset.
+     */
+    cflags = cpu->cflags_next_tb;
+    if (cflags == -1) {
+        cflags = curr_cflags(cpu);
+    } else {
+        cpu->cflags_next_tb = -1;
+    }
+
     tb = tb_lookup(cpu, pc, cs_base, flags, cflags);
     if (tb == NULL) {
         mmap_lock();
@@ -868,21 +882,7 @@ int cpu_exec(CPUState *cpu)
         int tb_exit = 0;
 
         while (!cpu_handle_interrupt(cpu, &last_tb)) {
-            uint32_t cflags = cpu->cflags_next_tb;
-            TranslationBlock *tb;
-
-            /* When requested, use an exact setting for cflags for the next
-               execution.  This is used for icount, precise smc, and stop-
-               after-access watchpoints.  Since this request should never
-               have CF_INVALID set, -1 is a convenient invalid value that
-               does not require tcg headers for cpu_common_reset.  */
-            if (cflags == -1) {
-                cflags = curr_cflags(cpu);
-            } else {
-                cpu->cflags_next_tb = -1;
-            }
-
-            tb = tb_find(cpu, last_tb, tb_exit, cflags);
+            TranslationBlock *tb = tb_find(cpu, last_tb, tb_exit);
             cpu_loop_exec_tb(cpu, tb, &last_tb, &tb_exit);
             /* Try to align the host and virtual clocks
                if the guest is in advance */
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 08/10] accel/tcg: Adjust interface of TranslatorOps.breakpoint_check
  2021-07-12 15:39 [PATCH v2 00/10] tcg: breakpoint reorg Richard Henderson
                   ` (6 preceding siblings ...)
  2021-07-12 15:40 ` [PATCH v2 07/10] accel/tcg: Move cflags lookup into tb_find Richard Henderson
@ 2021-07-12 15:40 ` Richard Henderson
  2021-07-17 17:52   ` Peter Maydell
  2021-07-12 15:40 ` [PATCH v2 09/10] accel/tcg: Hoist tb_cflags to a local in translator_loop Richard Henderson
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 37+ messages in thread
From: Richard Henderson @ 2021-07-12 15:40 UTC (permalink / raw)
  To: qemu-devel; +Cc: mark.cave-ayland, alex.bennee, f4bug

We don't need the whole CPUBreakpoint structure in the check,
only the flags.  Return the instruction length to consolidate
the adjustment of db->pc_next.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/exec/translator.h     | 17 +++++++++------
 accel/tcg/translator.c        | 40 ++++++++++++++++++++++++-----------
 target/alpha/translate.c      | 12 +++--------
 target/arm/translate-a64.c    | 14 ++++--------
 target/arm/translate.c        | 20 +++++++-----------
 target/avr/translate.c        |  6 +++---
 target/cris/translate.c       | 14 ++++--------
 target/hexagon/translate.c    | 13 +++---------
 target/hppa/translate.c       |  7 +++---
 target/i386/tcg/translate.c   | 15 ++++---------
 target/m68k/translate.c       | 14 +++---------
 target/microblaze/translate.c | 14 +++---------
 target/mips/tcg/translate.c   | 14 ++++--------
 target/nios2/translate.c      | 13 +++---------
 target/openrisc/translate.c   | 11 +++-------
 target/ppc/translate.c        | 13 +++---------
 target/riscv/translate.c      | 11 +++-------
 target/rx/translate.c         |  8 +++----
 target/s390x/translate.c      | 12 ++++-------
 target/sh4/translate.c        | 12 ++++-------
 target/sparc/translate.c      |  9 ++++----
 target/tricore/translate.c    | 13 +++---------
 target/xtensa/translate.c     | 12 ++++-------
 23 files changed, 115 insertions(+), 199 deletions(-)

diff --git a/include/exec/translator.h b/include/exec/translator.h
index dd9c06d40d..433b753c5c 100644
--- a/include/exec/translator.h
+++ b/include/exec/translator.h
@@ -92,11 +92,15 @@ typedef struct DisasContextBase {
  * @breakpoint_check:
  *      When called, the breakpoint has already been checked to match the PC,
  *      but the target may decide the breakpoint missed the address
- *      (e.g., due to conditions encoded in their flags).  Return true to
- *      indicate that the breakpoint did hit, in which case no more breakpoints
- *      are checked.  If the breakpoint did hit, emit any code required to
- *      signal the exception, and set db->is_jmp as necessary to terminate
- *      the main loop.
+ *      (e.g., due to conditions encoded in their flags), in which case
+ *      db->is_jmp may be left as DISAS_NEXT or DISAS_TOO_MANY to indicate
+ *      that the insn should be translated.  Anything other than those two
+ *      will be taken to indicate an exception has been raised, but in most
+ *      cases db->is_jmp should be set to DISAS_NORETURN.
+ *
+ *      Return the minimum instruction size that should be applied to the TB.
+ *      The size of any TB cannot be zero, as that breaks the math used to
+ *      invalidate TBs.
  *
  * @translate_insn:
  *      Disassemble one instruction and set db->pc_next for the start
@@ -113,8 +117,7 @@ typedef struct TranslatorOps {
     void (*init_disas_context)(DisasContextBase *db, CPUState *cpu);
     void (*tb_start)(DisasContextBase *db, CPUState *cpu);
     void (*insn_start)(DisasContextBase *db, CPUState *cpu);
-    bool (*breakpoint_check)(DisasContextBase *db, CPUState *cpu,
-                             const CPUBreakpoint *bp);
+    int (*breakpoint_check)(DisasContextBase *db, CPUState *cpu, int flags);
     void (*translate_insn)(DisasContextBase *db, CPUState *cpu);
     void (*tb_stop)(DisasContextBase *db, CPUState *cpu);
     void (*disas_log)(const DisasContextBase *db, CPUState *cpu);
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index a59eb7c11b..1c44d096d8 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -50,7 +50,6 @@ bool translator_use_goto_tb(DisasContextBase *db, target_ulong dest)
 void translator_loop(const TranslatorOps *ops, DisasContextBase *db,
                      CPUState *cpu, TranslationBlock *tb, int max_insns)
 {
-    int bp_insn = 0;
     bool plugin_enabled;
 
     /* Initialize DisasContext */
@@ -91,19 +90,35 @@ void translator_loop(const TranslatorOps *ops, DisasContextBase *db,
             CPUBreakpoint *bp;
             QTAILQ_FOREACH(bp, &cpu->breakpoints, entry) {
                 if (bp->pc == db->pc_next) {
-                    if (ops->breakpoint_check(db, cpu, bp)) {
-                        bp_insn = 1;
-                        break;
+                    int len = ops->breakpoint_check(db, cpu, bp->flags);
+
+                    /*
+                     * The breakpoint_check hook may use DISAS_TOO_MANY
+                     * to indicate that only one more instruction is to
+                     * be executed.  Otherwise it should use DISAS_NORETURN
+                     * when generating an exception, but may use a
+                     * DISAS_TARGET_* value for Something Else.
+                     */
+                    if (db->is_jmp > DISAS_TOO_MANY) {
+                        /*
+                         * The address covered by the breakpoint must be
+                         * included in [tb->pc, tb->pc + tb->size) in order
+                         * to for it to be properly cleared.  Thus we
+                         * increment the PC here so that the logic setting
+                         * tb->size below does the right thing.
+                         */
+                        tcg_debug_assert(len > 0);
+                        db->pc_next += len;
+
+                        /*
+                         * The breakpoint definitely hit, so decrement the
+                         * number of instructions completed for icount.
+                         */
+                        db->num_insns--;
+                        goto done;
                     }
                 }
             }
-            /* The breakpoint_check hook may use DISAS_TOO_MANY to indicate
-               that only one more instruction is to be executed.  Otherwise
-               it should use DISAS_NORETURN when generating an exception,
-               but may use a DISAS_TARGET_* value for Something Else.  */
-            if (db->is_jmp > DISAS_TOO_MANY) {
-                break;
-            }
         }
 
         /* Disassemble one instruction.  The translate_insn hook should
@@ -142,9 +157,10 @@ void translator_loop(const TranslatorOps *ops, DisasContextBase *db,
         }
     }
 
+ done:
     /* Emit code to exit the TB, as indicated by db->is_jmp.  */
     ops->tb_stop(db, cpu);
-    gen_tb_end(db->tb, db->num_insns - bp_insn);
+    gen_tb_end(db->tb, db->num_insns);
 
     if (plugin_enabled) {
         plugin_gen_tb_end(cpu);
diff --git a/target/alpha/translate.c b/target/alpha/translate.c
index 833d3baa7b..07d0197cfb 100644
--- a/target/alpha/translate.c
+++ b/target/alpha/translate.c
@@ -3012,19 +3012,13 @@ static void alpha_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
     tcg_gen_insn_start(dcbase->pc_next);
 }
 
-static bool alpha_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
-                                      const CPUBreakpoint *bp)
+static int alpha_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
+                                     int bp_flags)
 {
     DisasContext *ctx = container_of(dcbase, DisasContext, base);
 
     ctx->base.is_jmp = gen_excp(ctx, EXCP_DEBUG, 0);
-
-    /* The address covered by the breakpoint must be included in
-       [tb->pc, tb->pc + tb->size) in order to for it to be
-       properly cleared -- thus we increment the PC here so that
-       the logic setting tb->size below does the right thing.  */
-    ctx->base.pc_next += 4;
-    return true;
+    return 4; /* minimum instruction length */
 }
 
 static void alpha_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index ca11a5fecd..4efd4e95d6 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -14844,28 +14844,22 @@ static void aarch64_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
     dc->insn_start = tcg_last_op();
 }
 
-static bool aarch64_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
-                                        const CPUBreakpoint *bp)
+static int aarch64_tr_breakpoint_check(DisasContextBase *dcbase,
+                                       CPUState *cpu, int bp_flags)
 {
     DisasContext *dc = container_of(dcbase, DisasContext, base);
 
-    if (bp->flags & BP_CPU) {
+    if (bp_flags & BP_CPU) {
         gen_a64_set_pc_im(dc->base.pc_next);
         gen_helper_check_breakpoints(cpu_env);
         /* End the TB early; it likely won't be executed */
         dc->base.is_jmp = DISAS_TOO_MANY;
     } else {
         gen_exception_internal_insn(dc, dc->base.pc_next, EXCP_DEBUG);
-        /* The address covered by the breakpoint must be
-           included in [tb->pc, tb->pc + tb->size) in order
-           to for it to be properly cleared -- thus we
-           increment the PC here so that the logic setting
-           tb->size below does the right thing.  */
-        dc->base.pc_next += 4;
         dc->base.is_jmp = DISAS_NORETURN;
     }
 
-    return true;
+    return 4; /* minimum instruction length */
 }
 
 static void aarch64_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index e1a8152598..ebac31c3ac 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -9438,12 +9438,12 @@ static void arm_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
     dc->insn_start = tcg_last_op();
 }
 
-static bool arm_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
-                                    const CPUBreakpoint *bp)
+static int arm_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
+                                   int bp_flags)
 {
     DisasContext *dc = container_of(dcbase, DisasContext, base);
 
-    if (bp->flags & BP_CPU) {
+    if (bp_flags & BP_CPU) {
         gen_set_condexec(dc);
         gen_set_pc_im(dc, dc->base.pc_next);
         gen_helper_check_breakpoints(cpu_env);
@@ -9451,18 +9451,14 @@ static bool arm_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
         dc->base.is_jmp = DISAS_TOO_MANY;
     } else {
         gen_exception_internal_insn(dc, dc->base.pc_next, EXCP_DEBUG);
-        /* The address covered by the breakpoint must be
-           included in [tb->pc, tb->pc + tb->size) in order
-           to for it to be properly cleared -- thus we
-           increment the PC here so that the logic setting
-           tb->size below does the right thing.  */
-        /* TODO: Advance PC by correct instruction length to
-         * avoid disassembler error messages */
-        dc->base.pc_next += 2;
         dc->base.is_jmp = DISAS_NORETURN;
     }
 
-    return true;
+    /*
+     * TODO: Advance PC by correct instruction length to avoid disassembler
+     * error messages.  In the meantime, minimum instruction length.
+     */
+    return 2;
 }
 
 static bool arm_pre_translate_insn(DisasContext *dc)
diff --git a/target/avr/translate.c b/target/avr/translate.c
index 8237a03c23..73ff467926 100644
--- a/target/avr/translate.c
+++ b/target/avr/translate.c
@@ -2944,13 +2944,13 @@ static void avr_tr_insn_start(DisasContextBase *dcbase, CPUState *cs)
     tcg_gen_insn_start(ctx->npc);
 }
 
-static bool avr_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
-                                    const CPUBreakpoint *bp)
+static int avr_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
+                                   int bp_flags)
 {
     DisasContext *ctx = container_of(dcbase, DisasContext, base);
 
     gen_breakpoint(ctx);
-    return true;
+    return 2; /* minimum instruction length */
 }
 
 static void avr_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
diff --git a/target/cris/translate.c b/target/cris/translate.c
index 9258c13e9f..b590e79433 100644
--- a/target/cris/translate.c
+++ b/target/cris/translate.c
@@ -3118,8 +3118,8 @@ static void cris_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
     tcg_gen_insn_start(dc->delayed_branch == 1 ? dc->ppc | 1 : dc->pc);
 }
 
-static bool cris_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
-                                     const CPUBreakpoint *bp)
+static int cris_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
+                                    int bp_flags)
 {
     DisasContext *dc = container_of(dcbase, DisasContext, base);
 
@@ -3127,14 +3127,8 @@ static bool cris_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
     tcg_gen_movi_tl(env_pc, dc->pc);
     t_gen_raise_exception(EXCP_DEBUG);
     dc->base.is_jmp = DISAS_NORETURN;
-    /*
-     * The address covered by the breakpoint must be included in
-     * [tb->pc, tb->pc + tb->size) in order to for it to be
-     * properly cleared -- thus we increment the PC here so that
-     * the logic setting tb->size below does the right thing.
-     */
-    dc->pc += 2;
-    return true;
+
+    return 2; /* minimum instruction length */
 }
 
 static void cris_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c
index b23d36adf5..75c0d40a13 100644
--- a/target/hexagon/translate.c
+++ b/target/hexagon/translate.c
@@ -540,20 +540,13 @@ static void hexagon_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
     tcg_gen_insn_start(ctx->base.pc_next);
 }
 
-static bool hexagon_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
-                                        const CPUBreakpoint *bp)
+static int hexagon_tr_breakpoint_check(DisasContextBase *dcbase,
+                                       CPUState *cpu, int bp_flags)
 {
     DisasContext *ctx = container_of(dcbase, DisasContext, base);
 
     gen_exception_end_tb(ctx, EXCP_DEBUG);
-    /*
-     * The address covered by the breakpoint must be included in
-     * [tb->pc, tb->pc + tb->size) in order to for it to be
-     * properly cleared -- thus we increment the PC here so that
-     * the logic setting tb->size below does the right thing.
-     */
-    ctx->base.pc_next += 4;
-    return true;
+    return 4; /* minimum packet length */
 }
 
 static bool pkt_crosses_page(CPUHexagonState *env, DisasContext *ctx)
diff --git a/target/hppa/translate.c b/target/hppa/translate.c
index 835120c038..407136f26b 100644
--- a/target/hppa/translate.c
+++ b/target/hppa/translate.c
@@ -4204,14 +4204,13 @@ static void hppa_tr_insn_start(DisasContextBase *dcbase, CPUState *cs)
     tcg_gen_insn_start(ctx->iaoq_f, ctx->iaoq_b);
 }
 
-static bool hppa_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
-                                      const CPUBreakpoint *bp)
+static int hppa_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
+                                    int bp_flags)
 {
     DisasContext *ctx = container_of(dcbase, DisasContext, base);
 
     gen_excp(ctx, EXCP_DEBUG);
-    ctx->base.pc_next += 4;
-    return true;
+    return 4; /* minimum instruction length */
 }
 
 static void hppa_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 3814ce2a3e..bf5a393b9b 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -8583,23 +8583,16 @@ static void i386_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
     tcg_gen_insn_start(dc->base.pc_next, dc->cc_op);
 }
 
-static bool i386_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
-                                     const CPUBreakpoint *bp)
+static int i386_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
+                                    int bp_flags)
 {
     DisasContext *dc = container_of(dcbase, DisasContext, base);
     /* If RF is set, suppress an internally generated breakpoint.  */
     int flags = dc->base.tb->flags & HF_RF_MASK ? BP_GDB : BP_ANY;
-    if (bp->flags & flags) {
+    if (bp_flags & flags) {
         gen_debug(dc);
-        /* The address covered by the breakpoint must be included in
-           [tb->pc, tb->pc + tb->size) in order to for it to be
-           properly cleared -- thus we increment the PC here so that
-           the generic logic setting tb->size later does the right thing.  */
-        dc->base.pc_next += 1;
-        return true;
-    } else {
-        return false;
     }
+    return 1; /* minimum instruction length */
 }
 
 static void i386_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
diff --git a/target/m68k/translate.c b/target/m68k/translate.c
index 1fee04b8dd..79c1847d54 100644
--- a/target/m68k/translate.c
+++ b/target/m68k/translate.c
@@ -6208,21 +6208,13 @@ static void m68k_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
     tcg_gen_insn_start(dc->base.pc_next, dc->cc_op);
 }
 
-static bool m68k_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
-                                     const CPUBreakpoint *bp)
+static int m68k_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
+                                    int bp_flags)
 {
     DisasContext *dc = container_of(dcbase, DisasContext, base);
 
     gen_exception(dc, dc->base.pc_next, EXCP_DEBUG);
-    /*
-     * The address covered by the breakpoint must be included in
-     * [tb->pc, tb->pc + tb->size) in order to for it to be
-     * properly cleared -- thus we increment the PC here so that
-     * the logic setting tb->size below does the right thing.
-     */
-    dc->base.pc_next += 2;
-
-    return true;
+    return 2; /* minimum instruction length */
 }
 
 static void m68k_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
diff --git a/target/microblaze/translate.c b/target/microblaze/translate.c
index c68a84a219..09a364cceb 100644
--- a/target/microblaze/translate.c
+++ b/target/microblaze/translate.c
@@ -1673,21 +1673,13 @@ static void mb_tr_insn_start(DisasContextBase *dcb, CPUState *cs)
     dc->insn_start = tcg_last_op();
 }
 
-static bool mb_tr_breakpoint_check(DisasContextBase *dcb, CPUState *cs,
-                                   const CPUBreakpoint *bp)
+static int mb_tr_breakpoint_check(DisasContextBase *dcb, CPUState *cs,
+                                  int bp_flags)
 {
     DisasContext *dc = container_of(dcb, DisasContext, base);
 
     gen_raise_exception_sync(dc, EXCP_DEBUG);
-
-    /*
-     * The address covered by the breakpoint must be included in
-     * [tb->pc, tb->pc + tb->size) in order to for it to be
-     * properly cleared -- thus we increment the PC here so that
-     * the logic setting tb->size below does the right thing.
-     */
-    dc->base.pc_next += 4;
-    return true;
+    return 4; /* minimum instruction length */
 }
 
 static void mb_tr_translate_insn(DisasContextBase *dcb, CPUState *cs)
diff --git a/target/mips/tcg/translate.c b/target/mips/tcg/translate.c
index 47c967acbf..c7b9d813c2 100644
--- a/target/mips/tcg/translate.c
+++ b/target/mips/tcg/translate.c
@@ -16190,22 +16190,16 @@ static void mips_tr_insn_start(DisasContextBase *dcbase, CPUState *cs)
                        ctx->btarget);
 }
 
-static bool mips_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
-                                     const CPUBreakpoint *bp)
+static int mips_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
+                                    int bp_flags)
 {
     DisasContext *ctx = container_of(dcbase, DisasContext, base);
 
     save_cpu_state(ctx, 1);
     ctx->base.is_jmp = DISAS_NORETURN;
     gen_helper_raise_exception_debug(cpu_env);
-    /*
-     * The address covered by the breakpoint must be included in
-     * [tb->pc, tb->pc + tb->size) in order to for it to be
-     * properly cleared -- thus we increment the PC here so that
-     * the logic setting tb->size below does the right thing.
-     */
-    ctx->base.pc_next += 4;
-    return true;
+
+    return 2; /* minimum instruction length */
 }
 
 static void mips_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
diff --git a/target/nios2/translate.c b/target/nios2/translate.c
index 17742cebc7..1d1c66b88f 100644
--- a/target/nios2/translate.c
+++ b/target/nios2/translate.c
@@ -777,20 +777,13 @@ static void nios2_tr_insn_start(DisasContextBase *dcbase, CPUState *cs)
     tcg_gen_insn_start(dcbase->pc_next);
 }
 
-static bool nios2_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
-                                      const CPUBreakpoint *bp)
+static int nios2_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
+                                     int bp_flags)
 {
     DisasContext *dc = container_of(dcbase, DisasContext, base);
 
     gen_exception(dc, EXCP_DEBUG);
-    /*
-     * The address covered by the breakpoint must be included in
-     * [tb->pc, tb->pc + tb->size) in order to for it to be
-     * properly cleared -- thus we increment the PC here so that
-     * the logic setting tb->size below does the right thing.
-     */
-    dc->base.pc_next += 4;
-    return true;
+    return 4; /* minimum instruction length */
 }
 
 static void nios2_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
diff --git a/target/openrisc/translate.c b/target/openrisc/translate.c
index 37c3e3e0a3..adcf71645b 100644
--- a/target/openrisc/translate.c
+++ b/target/openrisc/translate.c
@@ -1639,20 +1639,15 @@ static void openrisc_tr_insn_start(DisasContextBase *dcbase, CPUState *cs)
                        | (dc->base.num_insns > 1 ? 2 : 0));
 }
 
-static bool openrisc_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
-                                         const CPUBreakpoint *bp)
+static int openrisc_tr_breakpoint_check(DisasContextBase *dcbase,
+                                        CPUState *cs, int bp_flags)
 {
     DisasContext *dc = container_of(dcbase, DisasContext, base);
 
     tcg_gen_movi_tl(cpu_pc, dc->base.pc_next);
     gen_exception(dc, EXCP_DEBUG);
     dc->base.is_jmp = DISAS_NORETURN;
-    /* The address covered by the breakpoint must be included in
-       [tb->pc, tb->pc + tb->size) in order to for it to be
-       properly cleared -- thus we increment the PC here so that
-       the logic setting tb->size below does the right thing.  */
-    dc->base.pc_next += 4;
-    return true;
+    return 4; /* minimum instruction length */
 }
 
 static void openrisc_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index 0a55cb7181..5093be0694 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -8565,21 +8565,14 @@ static void ppc_tr_insn_start(DisasContextBase *dcbase, CPUState *cs)
     tcg_gen_insn_start(dcbase->pc_next);
 }
 
-static bool ppc_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
-                                    const CPUBreakpoint *bp)
+static int ppc_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
+                                   int bp_flags)
 {
     DisasContext *ctx = container_of(dcbase, DisasContext, base);
 
     gen_update_nip(ctx, ctx->base.pc_next);
     gen_debug_exception(ctx);
-    /*
-     * The address covered by the breakpoint must be included in
-     * [tb->pc, tb->pc + tb->size) in order to for it to be properly
-     * cleared -- thus we increment the PC here so that the logic
-     * setting tb->size below does the right thing.
-     */
-    ctx->base.pc_next += 4;
-    return true;
+    return 4; /* minimum instruction length */
 }
 
 static bool is_prefix_insn(DisasContext *ctx, uint32_t insn)
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index deda0c8a44..8a6bc58572 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -961,20 +961,15 @@ static void riscv_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
     tcg_gen_insn_start(ctx->base.pc_next);
 }
 
-static bool riscv_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
-                                      const CPUBreakpoint *bp)
+static int riscv_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
+                                     int bp_flags)
 {
     DisasContext *ctx = container_of(dcbase, DisasContext, base);
 
     tcg_gen_movi_tl(cpu_pc, ctx->base.pc_next);
     ctx->base.is_jmp = DISAS_NORETURN;
     gen_exception_debug();
-    /* The address covered by the breakpoint must be included in
-       [tb->pc, tb->pc + tb->size) in order to for it to be
-       properly cleared -- thus we increment the PC here so that
-       the logic setting tb->size below does the right thing.  */
-    ctx->base.pc_next += 4;
-    return true;
+    return 2; /* minimum instruction length */
 }
 
 static void riscv_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
diff --git a/target/rx/translate.c b/target/rx/translate.c
index 23a626438a..5e9950f3ac 100644
--- a/target/rx/translate.c
+++ b/target/rx/translate.c
@@ -2309,8 +2309,8 @@ static void rx_tr_insn_start(DisasContextBase *dcbase, CPUState *cs)
     tcg_gen_insn_start(ctx->base.pc_next);
 }
 
-static bool rx_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
-                                    const CPUBreakpoint *bp)
+static int rx_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
+                                  int bp_flags)
 {
     DisasContext *ctx = container_of(dcbase, DisasContext, base);
 
@@ -2318,8 +2318,8 @@ static bool rx_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
     tcg_gen_movi_i32(cpu_pc, ctx->base.pc_next);
     gen_helper_debug(cpu_env);
     ctx->base.is_jmp = DISAS_NORETURN;
-    ctx->base.pc_next += 1;
-    return true;
+
+    return 1; /* minimum instruction length */
 }
 
 static void rx_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
diff --git a/target/s390x/translate.c b/target/s390x/translate.c
index c8d55d1f83..06f8fa96a6 100644
--- a/target/s390x/translate.c
+++ b/target/s390x/translate.c
@@ -6567,8 +6567,8 @@ static void s390x_tr_insn_start(DisasContextBase *dcbase, CPUState *cs)
 {
 }
 
-static bool s390x_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
-                                      const CPUBreakpoint *bp)
+static int s390x_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
+                                     int bp_flags)
 {
     DisasContext *dc = container_of(dcbase, DisasContext, base);
 
@@ -6582,12 +6582,8 @@ static bool s390x_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
 
     dc->base.is_jmp = DISAS_PC_STALE;
     dc->do_debug = true;
-    /* The address covered by the breakpoint must be included in
-       [tb->pc, tb->pc + tb->size) in order to for it to be
-       properly cleared -- thus we increment the PC here so that
-       the logic setting tb->size does the right thing.  */
-    dc->base.pc_next += 2;
-    return true;
+
+    return 2; /* minimum instruction length */
 }
 
 static void s390x_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
diff --git a/target/sh4/translate.c b/target/sh4/translate.c
index 40898e2393..b1e19bf976 100644
--- a/target/sh4/translate.c
+++ b/target/sh4/translate.c
@@ -2289,8 +2289,8 @@ static void sh4_tr_insn_start(DisasContextBase *dcbase, CPUState *cs)
     tcg_gen_insn_start(ctx->base.pc_next, ctx->envflags);
 }
 
-static bool sh4_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
-                                    const CPUBreakpoint *bp)
+static int sh4_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
+                                   int bp_flags)
 {
     DisasContext *ctx = container_of(dcbase, DisasContext, base);
 
@@ -2298,12 +2298,8 @@ static bool sh4_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
     gen_save_cpu_state(ctx, true);
     gen_helper_debug(cpu_env);
     ctx->base.is_jmp = DISAS_NORETURN;
-    /* The address covered by the breakpoint must be included in
-       [tb->pc, tb->pc + tb->size) in order to for it to be
-       properly cleared -- thus we increment the PC here so that
-       the logic setting tb->size below does the right thing.  */
-    ctx->base.pc_next += 2;
-    return true;
+
+    return 2; /* minimum instruction length */
 }
 
 static void sh4_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index e530cb4aa8..d6b554cefe 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -5854,8 +5854,8 @@ static void sparc_tr_insn_start(DisasContextBase *dcbase, CPUState *cs)
     }
 }
 
-static bool sparc_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
-                                      const CPUBreakpoint *bp)
+static int sparc_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
+                                     int bp_flags)
 {
     DisasContext *dc = container_of(dcbase, DisasContext, base);
 
@@ -5865,9 +5865,8 @@ static bool sparc_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
     gen_helper_debug(cpu_env);
     tcg_gen_exit_tb(NULL, 0);
     dc->base.is_jmp = DISAS_NORETURN;
-    /* update pc_next so that the current instruction is included in tb->size */
-    dc->base.pc_next += 4;
-    return true;
+
+    return 4; /* minimum instruction length */
 }
 
 static void sparc_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
diff --git a/target/tricore/translate.c b/target/tricore/translate.c
index 865020754d..8c39134d52 100644
--- a/target/tricore/translate.c
+++ b/target/tricore/translate.c
@@ -8810,19 +8810,12 @@ static void tricore_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
     tcg_gen_insn_start(ctx->base.pc_next);
 }
 
-static bool tricore_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
-                                      const CPUBreakpoint *bp)
+static int tricore_tr_breakpoint_check(DisasContextBase *dcbase,
+                                       CPUState *cpu, int bp_flags)
 {
     DisasContext *ctx = container_of(dcbase, DisasContext, base);
     generate_qemu_excp(ctx, EXCP_DEBUG);
-    /*
-     * The address covered by the breakpoint must be included in
-     * [tb->pc, tb->pc + tb->size) in order to for it to be
-     * properly cleared -- thus we increment the PC here so that
-     * the logic setting tb->size below does the right thing.
-     */
-    ctx->base.pc_next += 4;
-    return true;
+    return 4; /* minimum instruction length */
 }
 
 static bool insn_crosses_page(CPUTriCoreState *env, DisasContext *ctx)
diff --git a/target/xtensa/translate.c b/target/xtensa/translate.c
index 7094cfcf1d..6e7ad266f4 100644
--- a/target/xtensa/translate.c
+++ b/target/xtensa/translate.c
@@ -1232,20 +1232,16 @@ static void xtensa_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
     tcg_gen_insn_start(dcbase->pc_next);
 }
 
-static bool xtensa_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
-                                       const CPUBreakpoint *bp)
+static int xtensa_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
+                                      int bp_flags)
 {
     DisasContext *dc = container_of(dcbase, DisasContext, base);
 
     tcg_gen_movi_i32(cpu_pc, dc->base.pc_next);
     gen_exception(dc, EXCP_DEBUG);
     dc->base.is_jmp = DISAS_NORETURN;
-    /* The address covered by the breakpoint must be included in
-       [tb->pc, tb->pc + tb->size) in order to for it to be
-       properly cleared -- thus we increment the PC here so that
-       the logic setting tb->size below does the right thing.  */
-    dc->base.pc_next += 2;
-    return true;
+
+    return 2; /* minimum instruction length */
 }
 
 static void xtensa_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 09/10] accel/tcg: Hoist tb_cflags to a local in translator_loop
  2021-07-12 15:39 [PATCH v2 00/10] tcg: breakpoint reorg Richard Henderson
                   ` (7 preceding siblings ...)
  2021-07-12 15:40 ` [PATCH v2 08/10] accel/tcg: Adjust interface of TranslatorOps.breakpoint_check Richard Henderson
@ 2021-07-12 15:40 ` Richard Henderson
  2021-07-17 17:50   ` Alex Bennée
  2021-07-12 15:40 ` [PATCH v2 10/10] accel/tcg: Encode breakpoint info into tb->cflags Richard Henderson
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 37+ messages in thread
From: Richard Henderson @ 2021-07-12 15:40 UTC (permalink / raw)
  To: qemu-devel; +Cc: mark.cave-ayland, alex.bennee, f4bug

The access internal to tb_cflags() is atomic.
Avoid re-reading it as such for the multiple uses.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/translator.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index 1c44d096d8..449159a27c 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -50,6 +50,7 @@ bool translator_use_goto_tb(DisasContextBase *db, target_ulong dest)
 void translator_loop(const TranslatorOps *ops, DisasContextBase *db,
                      CPUState *cpu, TranslationBlock *tb, int max_insns)
 {
+    uint32_t cflags = tb_cflags(tb);
     bool plugin_enabled;
 
     /* Initialize DisasContext */
@@ -72,8 +73,7 @@ void translator_loop(const TranslatorOps *ops, DisasContextBase *db,
     ops->tb_start(db, cpu);
     tcg_debug_assert(db->is_jmp == DISAS_NEXT);  /* no early exit */
 
-    plugin_enabled = plugin_gen_tb_start(cpu, tb,
-                                         tb_cflags(db->tb) & CF_MEMI_ONLY);
+    plugin_enabled = plugin_gen_tb_start(cpu, tb, cflags & CF_MEMI_ONLY);
 
     while (true) {
         db->num_insns++;
@@ -125,14 +125,13 @@ void translator_loop(const TranslatorOps *ops, DisasContextBase *db,
            update db->pc_next and db->is_jmp to indicate what should be
            done next -- either exiting this loop or locate the start of
            the next instruction.  */
-        if (db->num_insns == db->max_insns
-            && (tb_cflags(db->tb) & CF_LAST_IO)) {
+        if (db->num_insns == db->max_insns && (cflags & CF_LAST_IO)) {
             /* Accept I/O on the last instruction.  */
             gen_io_start();
             ops->translate_insn(db, cpu);
         } else {
             /* we should only see CF_MEMI_ONLY for io_recompile */
-            tcg_debug_assert(!(tb_cflags(db->tb) & CF_MEMI_ONLY));
+            tcg_debug_assert(!(cflags & CF_MEMI_ONLY));
             ops->translate_insn(db, cpu);
         }
 
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH v2 10/10] accel/tcg: Encode breakpoint info into tb->cflags
  2021-07-12 15:39 [PATCH v2 00/10] tcg: breakpoint reorg Richard Henderson
                   ` (8 preceding siblings ...)
  2021-07-12 15:40 ` [PATCH v2 09/10] accel/tcg: Hoist tb_cflags to a local in translator_loop Richard Henderson
@ 2021-07-12 15:40 ` Richard Henderson
  2021-07-17 17:58   ` Peter Maydell
  2021-07-12 18:20 ` [PATCH v2 00/10] tcg: breakpoint reorg Mark Cave-Ayland
  2021-07-17 17:16 ` Richard Henderson
  11 siblings, 1 reply; 37+ messages in thread
From: Richard Henderson @ 2021-07-12 15:40 UTC (permalink / raw)
  To: qemu-devel; +Cc: mark.cave-ayland, alex.bennee, f4bug

Having this data in cflags means that hashing takes care
of selecting a TB with or without exceptions built in.
Which means that we no longer need to flush all TBs.

This does require that we single-step while we're within a page
that contains a breakpoint, so it's not yet ideal, but should be
an improvement over some corner-case slowdowns.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/404
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/exec/exec-all.h   |  7 ++++
 accel/tcg/cpu-exec.c      | 68 ++++++++++++++++++++++++++++++-
 accel/tcg/translate-all.c |  4 --
 accel/tcg/translator.c    | 85 +++++++++++++++++++++------------------
 cpu.c                     | 24 -----------
 5 files changed, 119 insertions(+), 69 deletions(-)

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 6873cce8df..7ab2578f71 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -502,9 +502,16 @@ struct TranslationBlock {
 #define CF_USE_ICOUNT    0x00020000
 #define CF_INVALID       0x00040000 /* TB is stale. Set with @jmp_lock held */
 #define CF_PARALLEL      0x00080000 /* Generate code for a parallel context */
+#define CF_BP_MASK       0x00300000 /* See below */
+#define CF_BP_SHIFT      20
 #define CF_CLUSTER_MASK  0xff000000 /* Top 8 bits are cluster ID */
 #define CF_CLUSTER_SHIFT 24
 
+#define CF_BP_NONE       (0 << CF_BP_SHIFT) /* TB does not interact with BPs */
+#define CF_BP_SSTEP      (1 << CF_BP_SHIFT) /* gdbstub single-step in effect */
+#define CF_BP_GDB        (2 << CF_BP_SHIFT) /* gdbstub breakpoint at tb->pc */
+#define CF_BP_CPU        (3 << CF_BP_SHIFT) /* arch breakpoint at tb->pc */
+
     /* Per-vCPU dynamic tracing state used to generate this TB */
     uint32_t trace_vcpu_dstate;
 
diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index 4d043a11aa..179a425ece 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -222,6 +222,65 @@ static inline void log_cpu_exec(target_ulong pc, CPUState *cpu,
     }
 }
 
+static uint32_t cflags_for_breakpoints(CPUState *cpu, target_ulong pc,
+                                       uint32_t cflags)
+{
+    uint32_t bflags = 0;
+
+    if (unlikely(cpu->singlestep_enabled)) {
+        bflags = CF_BP_SSTEP;
+    } else {
+        bool match_page = false;
+        CPUBreakpoint *bp;
+
+        QTAILQ_FOREACH(bp, &cpu->breakpoints, entry) {
+            /* Note exact pc matches */
+            if (pc == bp->pc) {
+                if (bp->flags & BP_GDB) {
+                    bflags = CF_BP_GDB;
+                    break;
+                }
+                if (bp->flags & BP_CPU) {
+                    bflags = CF_BP_CPU;
+                    break;
+                }
+            }
+            /* Note page matches */
+            if (((pc ^ bp->pc) & TARGET_PAGE_MASK) == 0) {
+                match_page = true;
+            }
+        }
+
+        if (likely(!bflags)) {
+            if (likely(!match_page)) {
+                return cflags;
+            }
+
+            /*
+             * Within the same page as a breakpoint, single-step,
+             * returning to helper_lookup_tb_ptr after each looking
+             * for the actual breakpoint.
+             *
+             * TODO: Perhaps better to record all of the TBs associated
+             * with a given virtual page that contains a breakpoint, and
+             * then invalidate them when a new overlapping breakpoint is
+             * set on the page.  Non-overlapping TBs would not be
+             * invalidated, nor would any TB need to be invalidated as
+             * breakpoints are removed.
+             */
+            bflags = CF_NO_GOTO_TB;
+        }
+    }
+
+    /*
+     * Reduce the TB to one insn.
+     * In the case of a BP hit, we will be raising an exception anyway.
+     * In the case of a page hit, return to helper_lookup_tb_ptr after
+     * every insn to look for the breakpoint.
+     */
+    return (cflags & ~CF_COUNT_MASK) | 1 | bflags;
+}
+
 /**
  * helper_lookup_tb_ptr: quick check for next tb
  * @env: current cpu state
@@ -235,11 +294,13 @@ const void *HELPER(lookup_tb_ptr)(CPUArchState *env)
     CPUState *cpu = env_cpu(env);
     TranslationBlock *tb;
     target_ulong cs_base, pc;
-    uint32_t flags;
+    uint32_t flags, cflags;
 
     cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
 
-    tb = tb_lookup(cpu, pc, cs_base, flags, curr_cflags(cpu));
+    cflags = cflags_for_breakpoints(cpu, pc, curr_cflags(cpu));
+
+    tb = tb_lookup(cpu, pc, cs_base, flags, cflags);
     if (tb == NULL) {
         return tcg_code_gen_epilogue;
     }
@@ -346,6 +407,8 @@ void cpu_exec_step_atomic(CPUState *cpu)
         cflags &= ~CF_PARALLEL;
         /* After 1 insn, return and release the exclusive lock. */
         cflags |= CF_NO_GOTO_TB | CF_NO_GOTO_PTR | 1;
+        /* Raise any post-instruction debug exceptions. */
+        cflags = cflags_for_breakpoints(cpu, pc, cflags);
 
         tb = tb_lookup(cpu, pc, cs_base, flags, cflags);
         if (tb == NULL) {
@@ -524,6 +587,7 @@ static inline TranslationBlock *tb_find(CPUState *cpu,
     } else {
         cpu->cflags_next_tb = -1;
     }
+    cflags = cflags_for_breakpoints(cpu, pc, cflags);
 
     tb = tb_lookup(cpu, pc, cs_base, flags, cflags);
     if (tb == NULL) {
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 491c1a56b2..1f98078608 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -1432,10 +1432,6 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
     }
     QEMU_BUILD_BUG_ON(CF_COUNT_MASK + 1 != TCG_MAX_INSNS);
 
-    if (cpu->singlestep_enabled) {
-        max_insns = 1;
-    }
-
  buffer_overflow:
     tb = tcg_tb_alloc(tcg_ctx);
     if (unlikely(!tb)) {
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index 449159a27c..01b48137da 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -33,13 +33,8 @@ void translator_loop_temp_check(DisasContextBase *db)
 
 bool translator_use_goto_tb(DisasContextBase *db, target_ulong dest)
 {
-    /* Suppress goto_tb if requested. */
-    if (tb_cflags(db->tb) & CF_NO_GOTO_TB) {
-        return false;
-    }
-
-    /* Suppress goto_tb in the case of single-steping.  */
-    if (db->singlestep_enabled) {
+    /* Suppress goto_tb if requested, or required by breakpoints. */
+    if (tb_cflags(db->tb) & (CF_NO_GOTO_TB | CF_BP_MASK)) {
         return false;
     }
 
@@ -51,6 +46,7 @@ void translator_loop(const TranslatorOps *ops, DisasContextBase *db,
                      CPUState *cpu, TranslationBlock *tb, int max_insns)
 {
     uint32_t cflags = tb_cflags(tb);
+    int bp_flags = 0;
     bool plugin_enabled;
 
     /* Initialize DisasContext */
@@ -60,7 +56,23 @@ void translator_loop(const TranslatorOps *ops, DisasContextBase *db,
     db->is_jmp = DISAS_NEXT;
     db->num_insns = 0;
     db->max_insns = max_insns;
-    db->singlestep_enabled = cpu->singlestep_enabled;
+    db->singlestep_enabled = false;
+
+    switch (cflags & CF_BP_MASK) {
+    case CF_BP_NONE:
+        break;
+    case CF_BP_SSTEP:
+        db->singlestep_enabled = true;
+        break;
+    case CF_BP_GDB:
+        bp_flags = BP_GDB;
+        break;
+    case CF_BP_CPU:
+        bp_flags = BP_CPU;
+        break;
+    default:
+        g_assert_not_reached();
+    }
 
     ops->init_disas_context(db, cpu);
     tcg_debug_assert(db->is_jmp == DISAS_NEXT);  /* no early exit */
@@ -85,39 +97,34 @@ void translator_loop(const TranslatorOps *ops, DisasContextBase *db,
         }
 
         /* Pass breakpoint hits to target for further processing */
-        if (!db->singlestep_enabled
-            && unlikely(!QTAILQ_EMPTY(&cpu->breakpoints))) {
-            CPUBreakpoint *bp;
-            QTAILQ_FOREACH(bp, &cpu->breakpoints, entry) {
-                if (bp->pc == db->pc_next) {
-                    int len = ops->breakpoint_check(db, cpu, bp->flags);
+        if (unlikely(bp_flags)) {
+            int len = ops->breakpoint_check(db, cpu, bp_flags);
 
-                    /*
-                     * The breakpoint_check hook may use DISAS_TOO_MANY
-                     * to indicate that only one more instruction is to
-                     * be executed.  Otherwise it should use DISAS_NORETURN
-                     * when generating an exception, but may use a
-                     * DISAS_TARGET_* value for Something Else.
-                     */
-                    if (db->is_jmp > DISAS_TOO_MANY) {
-                        /*
-                         * The address covered by the breakpoint must be
-                         * included in [tb->pc, tb->pc + tb->size) in order
-                         * to for it to be properly cleared.  Thus we
-                         * increment the PC here so that the logic setting
-                         * tb->size below does the right thing.
-                         */
-                        tcg_debug_assert(len > 0);
-                        db->pc_next += len;
+            /*
+             * When there is a bp hit, we're going to execute a maximum
+             * of one instruction.  The breakpoint_check hook may use
+             * DISAS_NEXT or DISAS_TOO_MANY to indicate that the current
+             * instruction should be translated.  Anything else is taken
+             * to mean that an exception has been raised and that zero
+             * instructions will be executed.
+             */
+            if (db->is_jmp > DISAS_TOO_MANY) {
+                /*
+                 * The address covered by the breakpoint must be
+                 * included in [tb->pc, tb->pc + tb->size) in order
+                 * to for it to be properly cleared.  Thus we
+                 * increment the PC here so that the logic setting
+                 * tb->size below does the right thing.
+                 */
+                tcg_debug_assert(len > 0);
+                db->pc_next += len;
 
-                        /*
-                         * The breakpoint definitely hit, so decrement the
-                         * number of instructions completed for icount.
-                         */
-                        db->num_insns--;
-                        goto done;
-                    }
-                }
+                /*
+                 * The breakpoint definitely hit, so decrement the
+                 * number of instructions completed for icount.
+                 */
+                db->num_insns--;
+                goto done;
             }
         }
 
diff --git a/cpu.c b/cpu.c
index 83059537d7..addcb5db9c 100644
--- a/cpu.c
+++ b/cpu.c
@@ -225,11 +225,6 @@ void tb_invalidate_phys_addr(target_ulong addr)
     tb_invalidate_phys_page_range(addr, addr + 1);
     mmap_unlock();
 }
-
-static void breakpoint_invalidate(CPUState *cpu, target_ulong pc)
-{
-    tb_invalidate_phys_addr(pc);
-}
 #else
 void tb_invalidate_phys_addr(AddressSpace *as, hwaddr addr, MemTxAttrs attrs)
 {
@@ -250,17 +245,6 @@ void tb_invalidate_phys_addr(AddressSpace *as, hwaddr addr, MemTxAttrs attrs)
     ram_addr = memory_region_get_ram_addr(mr) + addr;
     tb_invalidate_phys_page_range(ram_addr, ram_addr + 1);
 }
-
-static void breakpoint_invalidate(CPUState *cpu, target_ulong pc)
-{
-    /*
-     * There may not be a virtual to physical translation for the pc
-     * right now, but there may exist cached TB for this pc.
-     * Flush the whole TB cache to force re-translation of such TBs.
-     * This is heavyweight, but we're debugging anyway.
-     */
-    tb_flush(cpu);
-}
 #endif
 
 /* Add a breakpoint.  */
@@ -281,8 +265,6 @@ int cpu_breakpoint_insert(CPUState *cpu, vaddr pc, int flags,
         QTAILQ_INSERT_TAIL(&cpu->breakpoints, bp, entry);
     }
 
-    breakpoint_invalidate(cpu, pc);
-
     if (breakpoint) {
         *breakpoint = bp;
     }
@@ -310,8 +292,6 @@ void cpu_breakpoint_remove_by_ref(CPUState *cpu, CPUBreakpoint *bp)
 {
     QTAILQ_REMOVE(&cpu->breakpoints, bp, entry);
 
-    breakpoint_invalidate(cpu, bp->pc);
-
     trace_breakpoint_remove(cpu->cpu_index, bp->pc, bp->flags);
     g_free(bp);
 }
@@ -336,10 +316,6 @@ void cpu_single_step(CPUState *cpu, int enabled)
         cpu->singlestep_enabled = enabled;
         if (kvm_enabled()) {
             kvm_update_guest_debug(cpu, 0);
-        } else {
-            /* must flush all the translated code to avoid inconsistencies */
-            /* XXX: only flush what is necessary */
-            tb_flush(cpu);
         }
         trace_breakpoint_singlestep(cpu->cpu_index, enabled);
     }
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 00/10] tcg: breakpoint reorg
  2021-07-12 15:39 [PATCH v2 00/10] tcg: breakpoint reorg Richard Henderson
                   ` (9 preceding siblings ...)
  2021-07-12 15:40 ` [PATCH v2 10/10] accel/tcg: Encode breakpoint info into tb->cflags Richard Henderson
@ 2021-07-12 18:20 ` Mark Cave-Ayland
  2021-07-17 17:16 ` Richard Henderson
  11 siblings, 0 replies; 37+ messages in thread
From: Mark Cave-Ayland @ 2021-07-12 18:20 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: alex.bennee, f4bug

On 12/07/2021 16:39, Richard Henderson wrote:

> This is fixing #404 ("windows xp boot takes much longer...")
> and several other similar reports.
> 
> For v2, all prerequisites and 7 of the patches from v1 with
> reviews are now upstream.
> 
> Mark Cave-Ayland reported success with WinXP with v1, with
> this patch set being even faster than b55f54bc~1.  Which was
> a bit of a surprise, but I'll take it.  It means that it's
> probably not worth making the breakpoint detection scheme
> any more complicated.
> 
> I'd still like some more feedback.  Given this is fixing a
> regression from qemu 5.2 I feel comfortable delaying this
> past soft freeze, but not past hard freeze on the 20th.
> 
> 
> r~
> 
> 
> Richard Henderson (10):
>    accel/tcg: Reduce CF_COUNT_MASK to match TCG_MAX_INSNS
>    accel/tcg: Move curr_cflags into cpu-exec.c
>    accel/tcg: Add CF_NO_GOTO_TB and CF_NO_GOTO_PTR
>    accel/tcg: Drop CF_NO_GOTO_PTR from -d nochain
>    accel/tcg: Handle -singlestep in curr_cflags
>    accel/tcg: Use CF_NO_GOTO_{TB,PTR} in cpu_exec_step_atomic
>    accel/tcg: Move cflags lookup into tb_find
>    accel/tcg: Adjust interface of TranslatorOps.breakpoint_check
>    accel/tcg: Hoist tb_cflags to a local in translator_loop
>    accel/tcg: Encode breakpoint info into tb->cflags
> 
>   include/exec/exec-all.h       |  30 +++++---
>   include/exec/translator.h     |  17 +++--
>   accel/tcg/cpu-exec.c          | 130 ++++++++++++++++++++++++++++------
>   accel/tcg/translate-all.c     |   7 +-
>   accel/tcg/translator.c        |  79 ++++++++++++++-------
>   cpu.c                         |  24 -------
>   target/alpha/translate.c      |  12 +---
>   target/arm/translate-a64.c    |  14 ++--
>   target/arm/translate.c        |  20 +++---
>   target/avr/translate.c        |   6 +-
>   target/cris/translate.c       |  14 ++--
>   target/hexagon/translate.c    |  13 +---
>   target/hppa/translate.c       |   7 +-
>   target/i386/tcg/translate.c   |  15 ++--
>   target/m68k/translate.c       |  14 +---
>   target/microblaze/translate.c |  14 +---
>   target/mips/tcg/translate.c   |  14 ++--
>   target/nios2/translate.c      |  13 +---
>   target/openrisc/translate.c   |  11 +--
>   target/ppc/translate.c        |  13 +---
>   target/riscv/translate.c      |  11 +--
>   target/rx/translate.c         |   8 +--
>   target/s390x/translate.c      |  12 ++--
>   target/sh4/translate.c        |  12 ++--
>   target/sparc/translate.c      |   9 ++-
>   target/tricore/translate.c    |  13 +---
>   target/xtensa/translate.c     |  12 ++--
>   tcg/tcg-op.c                  |  28 ++++----
>   28 files changed, 280 insertions(+), 292 deletions(-)

FWIW I've just tested this v2 patchset on top of git master (bd38ae26ce) and I still 
see the same improvement i.e. WinXP boot to the login screen goes down from 2m 38s to 
25s so:

Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>


ATB,

Mark.


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 00/10] tcg: breakpoint reorg
  2021-07-12 15:39 [PATCH v2 00/10] tcg: breakpoint reorg Richard Henderson
                   ` (10 preceding siblings ...)
  2021-07-12 18:20 ` [PATCH v2 00/10] tcg: breakpoint reorg Mark Cave-Ayland
@ 2021-07-17 17:16 ` Richard Henderson
  11 siblings, 0 replies; 37+ messages in thread
From: Richard Henderson @ 2021-07-17 17:16 UTC (permalink / raw)
  To: qemu-devel, Peter Maydell; +Cc: mark.cave-ayland, alex.bennee, f4bug

Ping.  A Tested-by is nice, but Reviewed-by is better,
and time is running out, even for bug fixes.

r~

On 7/12/21 8:39 AM, Richard Henderson wrote:
> This is fixing #404 ("windows xp boot takes much longer...")
> and several other similar reports.
> 
> For v2, all prerequisites and 7 of the patches from v1 with
> reviews are now upstream.
> 
> Mark Cave-Ayland reported success with WinXP with v1, with
> this patch set being even faster than b55f54bc~1.  Which was
> a bit of a surprise, but I'll take it.  It means that it's
> probably not worth making the breakpoint detection scheme
> any more complicated.
> 
> I'd still like some more feedback.  Given this is fixing a
> regression from qemu 5.2 I feel comfortable delaying this
> past soft freeze, but not past hard freeze on the 20th.
> 
> 
> r~
> 
> 
> Richard Henderson (10):
>    accel/tcg: Reduce CF_COUNT_MASK to match TCG_MAX_INSNS
>    accel/tcg: Move curr_cflags into cpu-exec.c
>    accel/tcg: Add CF_NO_GOTO_TB and CF_NO_GOTO_PTR
>    accel/tcg: Drop CF_NO_GOTO_PTR from -d nochain
>    accel/tcg: Handle -singlestep in curr_cflags
>    accel/tcg: Use CF_NO_GOTO_{TB,PTR} in cpu_exec_step_atomic
>    accel/tcg: Move cflags lookup into tb_find
>    accel/tcg: Adjust interface of TranslatorOps.breakpoint_check
>    accel/tcg: Hoist tb_cflags to a local in translator_loop
>    accel/tcg: Encode breakpoint info into tb->cflags
> 
>   include/exec/exec-all.h       |  30 +++++---
>   include/exec/translator.h     |  17 +++--
>   accel/tcg/cpu-exec.c          | 130 ++++++++++++++++++++++++++++------
>   accel/tcg/translate-all.c     |   7 +-
>   accel/tcg/translator.c        |  79 ++++++++++++++-------
>   cpu.c                         |  24 -------
>   target/alpha/translate.c      |  12 +---
>   target/arm/translate-a64.c    |  14 ++--
>   target/arm/translate.c        |  20 +++---
>   target/avr/translate.c        |   6 +-
>   target/cris/translate.c       |  14 ++--
>   target/hexagon/translate.c    |  13 +---
>   target/hppa/translate.c       |   7 +-
>   target/i386/tcg/translate.c   |  15 ++--
>   target/m68k/translate.c       |  14 +---
>   target/microblaze/translate.c |  14 +---
>   target/mips/tcg/translate.c   |  14 ++--
>   target/nios2/translate.c      |  13 +---
>   target/openrisc/translate.c   |  11 +--
>   target/ppc/translate.c        |  13 +---
>   target/riscv/translate.c      |  11 +--
>   target/rx/translate.c         |   8 +--
>   target/s390x/translate.c      |  12 ++--
>   target/sh4/translate.c        |  12 ++--
>   target/sparc/translate.c      |   9 ++-
>   target/tricore/translate.c    |  13 +---
>   target/xtensa/translate.c     |  12 ++--
>   tcg/tcg-op.c                  |  28 ++++----
>   28 files changed, 280 insertions(+), 292 deletions(-)
> 



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 01/10] accel/tcg: Reduce CF_COUNT_MASK to match TCG_MAX_INSNS
  2021-07-12 15:39 ` [PATCH v2 01/10] accel/tcg: Reduce CF_COUNT_MASK to match TCG_MAX_INSNS Richard Henderson
@ 2021-07-17 17:30   ` Peter Maydell
  2021-07-17 18:31     ` Richard Henderson
  2021-07-17 17:33   ` Alex Bennée
  1 sibling, 1 reply; 37+ messages in thread
From: Peter Maydell @ 2021-07-17 17:30 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Alex Bennée, Mark Cave-Ayland, QEMU Developers,
	Philippe Mathieu-Daudé

On Mon, 12 Jul 2021 at 16:42, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> The space reserved for CF_COUNT_MASK was overly large.
> Reduce to free up cflags bits and eliminate an extra test.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  include/exec/exec-all.h   | 4 +++-
>  accel/tcg/translate-all.c | 5 ++---
>  2 files changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
> index 754f4130c9..dfe82ed19c 100644
> --- a/include/exec/exec-all.h
> +++ b/include/exec/exec-all.h
> @@ -492,7 +492,9 @@ struct TranslationBlock {
>      target_ulong cs_base; /* CS base for this block */
>      uint32_t flags; /* flags defining in which context the code was generated */
>      uint32_t cflags;    /* compile flags */
> -#define CF_COUNT_MASK  0x00007fff
> +
> +/* Note that TCG_MAX_INSNS is 512; we validate this match elsewhere. */
> +#define CF_COUNT_MASK  0x000001ff
>  #define CF_LAST_IO     0x00008000 /* Last insn may be an IO access.  */
>  #define CF_MEMI_ONLY   0x00010000 /* Only instrument memory ops */
>  #define CF_USE_ICOUNT  0x00020000
> diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
> index 4df26de858..997e44c73b 100644
> --- a/accel/tcg/translate-all.c
> +++ b/accel/tcg/translate-all.c
> @@ -1430,9 +1430,8 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
>      if (max_insns == 0) {
>          max_insns = CF_COUNT_MASK;
>      }
> -    if (max_insns > TCG_MAX_INSNS) {
> -        max_insns = TCG_MAX_INSNS;
> -    }
> +    QEMU_BUILD_BUG_ON(CF_COUNT_MASK + 1 != TCG_MAX_INSNS);
> +

Previously we would allow max_insns = TCG_MAX_INSNS (512).
Now we only allow it to be 511...

-- PMM


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 02/10] accel/tcg: Move curr_cflags into cpu-exec.c
  2021-07-12 15:39 ` [PATCH v2 02/10] accel/tcg: Move curr_cflags into cpu-exec.c Richard Henderson
@ 2021-07-17 17:32   ` Peter Maydell
  2021-07-17 17:34   ` Alex Bennée
  1 sibling, 0 replies; 37+ messages in thread
From: Peter Maydell @ 2021-07-17 17:32 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Alex Bennée, Mark Cave-Ayland, QEMU Developers,
	Philippe Mathieu-Daudé

On Mon, 12 Jul 2021 at 16:44, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> We will shortly have more than a simple member read here,
> with stuff not necessarily exposed to exec/exec-all.h.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

(apologies if you saw a no-content email: gmail burp)

thanks
-- PMM


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 01/10] accel/tcg: Reduce CF_COUNT_MASK to match TCG_MAX_INSNS
  2021-07-12 15:39 ` [PATCH v2 01/10] accel/tcg: Reduce CF_COUNT_MASK to match TCG_MAX_INSNS Richard Henderson
  2021-07-17 17:30   ` Peter Maydell
@ 2021-07-17 17:33   ` Alex Bennée
  1 sibling, 0 replies; 37+ messages in thread
From: Alex Bennée @ 2021-07-17 17:33 UTC (permalink / raw)
  To: Richard Henderson; +Cc: mark.cave-ayland, qemu-devel, f4bug


Richard Henderson <richard.henderson@linaro.org> writes:

> The space reserved for CF_COUNT_MASK was overly large.
> Reduce to free up cflags bits and eliminate an extra test.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 02/10] accel/tcg: Move curr_cflags into cpu-exec.c
  2021-07-12 15:39 ` [PATCH v2 02/10] accel/tcg: Move curr_cflags into cpu-exec.c Richard Henderson
  2021-07-17 17:32   ` Peter Maydell
@ 2021-07-17 17:34   ` Alex Bennée
  1 sibling, 0 replies; 37+ messages in thread
From: Alex Bennée @ 2021-07-17 17:34 UTC (permalink / raw)
  To: Richard Henderson; +Cc: mark.cave-ayland, qemu-devel, f4bug


Richard Henderson <richard.henderson@linaro.org> writes:

> We will shortly have more than a simple member read here,
> with stuff not necessarily exposed to exec/exec-all.h.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 03/10] accel/tcg: Add CF_NO_GOTO_TB and CF_NO_GOTO_PTR
  2021-07-12 15:39 ` [PATCH v2 03/10] accel/tcg: Add CF_NO_GOTO_TB and CF_NO_GOTO_PTR Richard Henderson
@ 2021-07-17 17:36   ` Alex Bennée
  2021-07-17 17:36   ` Peter Maydell
  1 sibling, 0 replies; 37+ messages in thread
From: Alex Bennée @ 2021-07-17 17:36 UTC (permalink / raw)
  To: Richard Henderson; +Cc: mark.cave-ayland, qemu-devel, f4bug


Richard Henderson <richard.henderson@linaro.org> writes:

> Move the -d nochain check to bits on tb->cflags.
> These will be used for more than -d nochain shortly.
>
> Set bits during curr_cflags, test them in translator_use_goto_tb,
> assert we're not doing anything odd in tcg_gen_goto_tb.  The test
> in tcg_gen_exit_tb is redundant with the assert for goto_tb_issue_mask.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 03/10] accel/tcg: Add CF_NO_GOTO_TB and CF_NO_GOTO_PTR
  2021-07-12 15:39 ` [PATCH v2 03/10] accel/tcg: Add CF_NO_GOTO_TB and CF_NO_GOTO_PTR Richard Henderson
  2021-07-17 17:36   ` Alex Bennée
@ 2021-07-17 17:36   ` Peter Maydell
  1 sibling, 0 replies; 37+ messages in thread
From: Peter Maydell @ 2021-07-17 17:36 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Alex Bennée, Mark Cave-Ayland, QEMU Developers,
	Philippe Mathieu-Daudé

On Mon, 12 Jul 2021 at 16:44, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Move the -d nochain check to bits on tb->cflags.
> These will be used for more than -d nochain shortly.
>
> Set bits during curr_cflags, test them in translator_use_goto_tb,
> assert we're not doing anything odd in tcg_gen_goto_tb.  The test
> in tcg_gen_exit_tb is redundant with the assert for goto_tb_issue_mask.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 04/10] accel/tcg: Drop CF_NO_GOTO_PTR from -d nochain
  2021-07-12 15:39 ` [PATCH v2 04/10] accel/tcg: Drop CF_NO_GOTO_PTR from -d nochain Richard Henderson
@ 2021-07-17 17:39   ` Peter Maydell
  2021-07-17 17:39   ` Alex Bennée
  1 sibling, 0 replies; 37+ messages in thread
From: Peter Maydell @ 2021-07-17 17:39 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Alex Bennée, Mark Cave-Ayland, QEMU Developers,
	Philippe Mathieu-Daudé

On Mon, 12 Jul 2021 at 16:44, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> The purpose of suppressing goto_ptr from -d nochain had been
> to return to the main loop so that -d cpu would be recognized.
> But we now include -d cpu logging in helper_lookup_tb_ptr so
> there is no need to exclude goto_ptr.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  accel/tcg/cpu-exec.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
> index d3232d5764..70ea3c7d68 100644
> --- a/accel/tcg/cpu-exec.c
> +++ b/accel/tcg/cpu-exec.c
> @@ -150,7 +150,7 @@ uint32_t curr_cflags(CPUState *cpu)
>      uint32_t cflags = cpu->tcg_cflags;
>
>      if (qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
> -        cflags |= CF_NO_GOTO_TB | CF_NO_GOTO_PTR;
> +        cflags |= CF_NO_GOTO_TB;
>      }

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 04/10] accel/tcg: Drop CF_NO_GOTO_PTR from -d nochain
  2021-07-12 15:39 ` [PATCH v2 04/10] accel/tcg: Drop CF_NO_GOTO_PTR from -d nochain Richard Henderson
  2021-07-17 17:39   ` Peter Maydell
@ 2021-07-17 17:39   ` Alex Bennée
  2021-07-17 18:38     ` Richard Henderson
  1 sibling, 1 reply; 37+ messages in thread
From: Alex Bennée @ 2021-07-17 17:39 UTC (permalink / raw)
  To: Richard Henderson; +Cc: mark.cave-ayland, qemu-devel, f4bug


Richard Henderson <richard.henderson@linaro.org> writes:

> The purpose of suppressing goto_ptr from -d nochain had been
> to return to the main loop so that -d cpu would be recognized.

Hmm is it though? I've always treated it as ensuring we always come out
into the main loop (which is helpful for debugging).

> But we now include -d cpu logging in helper_lookup_tb_ptr so
> there is no need to exclude goto_ptr.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  accel/tcg/cpu-exec.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
> index d3232d5764..70ea3c7d68 100644
> --- a/accel/tcg/cpu-exec.c
> +++ b/accel/tcg/cpu-exec.c
> @@ -150,7 +150,7 @@ uint32_t curr_cflags(CPUState *cpu)
>      uint32_t cflags = cpu->tcg_cflags;
>  
>      if (qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
> -        cflags |= CF_NO_GOTO_TB | CF_NO_GOTO_PTR;
> +        cflags |= CF_NO_GOTO_TB;
>      }
>  
>      return cflags;


-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 05/10] accel/tcg: Handle -singlestep in curr_cflags
  2021-07-12 15:39 ` [PATCH v2 05/10] accel/tcg: Handle -singlestep in curr_cflags Richard Henderson
@ 2021-07-17 17:41   ` Peter Maydell
  2021-07-17 17:42   ` Alex Bennée
  1 sibling, 0 replies; 37+ messages in thread
From: Peter Maydell @ 2021-07-17 17:41 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Alex Bennée, Mark Cave-Ayland, QEMU Developers,
	Philippe Mathieu-Daudé

On Mon, 12 Jul 2021 at 16:42, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Exchange the test in translator_use_goto_tb for CF_NO_GOTO_TB,
> and the test in tb_gen_code for setting CF_COUNT_MASK to 1.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  accel/tcg/cpu-exec.c      | 8 +++++++-
>  accel/tcg/translate-all.c | 2 +-
>  accel/tcg/translator.c    | 2 +-
>  3 files changed, 9 insertions(+), 3 deletions(-)

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 05/10] accel/tcg: Handle -singlestep in curr_cflags
  2021-07-12 15:39 ` [PATCH v2 05/10] accel/tcg: Handle -singlestep in curr_cflags Richard Henderson
  2021-07-17 17:41   ` Peter Maydell
@ 2021-07-17 17:42   ` Alex Bennée
  2021-07-17 18:42     ` Richard Henderson
  1 sibling, 1 reply; 37+ messages in thread
From: Alex Bennée @ 2021-07-17 17:42 UTC (permalink / raw)
  To: Richard Henderson; +Cc: mark.cave-ayland, qemu-devel, f4bug


Richard Henderson <richard.henderson@linaro.org> writes:

> Exchange the test in translator_use_goto_tb for CF_NO_GOTO_TB,
> and the test in tb_gen_code for setting CF_COUNT_MASK to 1.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  accel/tcg/cpu-exec.c      | 8 +++++++-
>  accel/tcg/translate-all.c | 2 +-
>  accel/tcg/translator.c    | 2 +-
>  3 files changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
> index 70ea3c7d68..2206c463f5 100644
> --- a/accel/tcg/cpu-exec.c
> +++ b/accel/tcg/cpu-exec.c
> @@ -149,7 +149,13 @@ uint32_t curr_cflags(CPUState *cpu)
>  {
>      uint32_t cflags = cpu->tcg_cflags;
>  
> -    if (qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
> +    /*
> +     * For singlestep and -d nochain, suppress goto_tb so that
> +     * we can log -d cpu,exec after every TB.
> +     */
> +    if (singlestep) {

Hmm we are testing a magic global here and looking at
cpu->singlestep_enabled lower down. We have a transient singlestep which
is turned on an off via cpu->singlestep_enabled and a global as a debug
option. Can we rationalise it further?

> +        cflags |= CF_NO_GOTO_TB | 1;
> +    } else if (qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
>          cflags |= CF_NO_GOTO_TB;
>      }
>  
> diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
> index 997e44c73b..491c1a56b2 100644
> --- a/accel/tcg/translate-all.c
> +++ b/accel/tcg/translate-all.c
> @@ -1432,7 +1432,7 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
>      }
>      QEMU_BUILD_BUG_ON(CF_COUNT_MASK + 1 != TCG_MAX_INSNS);
>  
> -    if (cpu->singlestep_enabled || singlestep) {
> +    if (cpu->singlestep_enabled) {
>          max_insns = 1;
>      }
>  
> diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
> index 2ea5a74f30..a59eb7c11b 100644
> --- a/accel/tcg/translator.c
> +++ b/accel/tcg/translator.c
> @@ -39,7 +39,7 @@ bool translator_use_goto_tb(DisasContextBase *db, target_ulong dest)
>      }
>  
>      /* Suppress goto_tb in the case of single-steping.  */
> -    if (db->singlestep_enabled || singlestep) {
> +    if (db->singlestep_enabled) {
>          return false;
>      }


-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 06/10] accel/tcg: Use CF_NO_GOTO_{TB, PTR} in cpu_exec_step_atomic
  2021-07-12 15:40 ` [PATCH v2 06/10] accel/tcg: Use CF_NO_GOTO_{TB, PTR} in cpu_exec_step_atomic Richard Henderson
@ 2021-07-17 17:43   ` Peter Maydell
  2021-07-17 19:03     ` Richard Henderson
  2021-07-17 17:47   ` [PATCH v2 06/10] accel/tcg: Use CF_NO_GOTO_{TB,PTR} " Alex Bennée
  1 sibling, 1 reply; 37+ messages in thread
From: Peter Maydell @ 2021-07-17 17:43 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Alex Bennée, Mark Cave-Ayland, QEMU Developers,
	Philippe Mathieu-Daudé

On Mon, 12 Jul 2021 at 16:46, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Request that the one TB returns immediately, so that
> we release the exclusive lock as soon as possible.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  accel/tcg/cpu-exec.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
> index 2206c463f5..5bb099174f 100644
> --- a/accel/tcg/cpu-exec.c
> +++ b/accel/tcg/cpu-exec.c
> @@ -330,8 +330,7 @@ void cpu_exec_step_atomic(CPUState *cpu)
>      CPUArchState *env = (CPUArchState *)cpu->env_ptr;
>      TranslationBlock *tb;
>      target_ulong cs_base, pc;
> -    uint32_t flags;
> -    uint32_t cflags = (curr_cflags(cpu) & ~CF_PARALLEL) | 1;
> +    uint32_t flags, cflags;
>      int tb_exit;
>
>      if (sigsetjmp(cpu->jmp_env, 0) == 0) {
> @@ -341,8 +340,14 @@ void cpu_exec_step_atomic(CPUState *cpu)
>          cpu->running = true;
>
>          cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
> -        tb = tb_lookup(cpu, pc, cs_base, flags, cflags);
>
> +        cflags = curr_cflags(cpu);
> +        /* Execute in a serial context. */
> +        cflags &= ~CF_PARALLEL;
> +        /* After 1 insn, return and release the exclusive lock. */
> +        cflags |= CF_NO_GOTO_TB | CF_NO_GOTO_PTR | 1;
> +
> +        tb = tb_lookup(cpu, pc, cs_base, flags, cflags);
>          if (tb == NULL) {
>              mmap_lock();
>              tb = tb_gen_code(cpu, pc, cs_base, flags, cflags);

So previously we would have executed possibly a chain of TBs
before releasing the lock, and now we definitely execute just one?
(I guess the execute-a-chain case is unlikely given the TB
only has one insn and we know it's an exclusive insn...)

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 07/10] accel/tcg: Move cflags lookup into tb_find
  2021-07-12 15:40 ` [PATCH v2 07/10] accel/tcg: Move cflags lookup into tb_find Richard Henderson
@ 2021-07-17 17:45   ` Peter Maydell
  0 siblings, 0 replies; 37+ messages in thread
From: Peter Maydell @ 2021-07-17 17:45 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Alex Bennée, Mark Cave-Ayland, QEMU Developers,
	Philippe Mathieu-Daudé

On Mon, 12 Jul 2021 at 16:42, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> We will shortly require the guest pc for computing cflags,
> so move the choice just after cpu_get_tb_cpu_state.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  accel/tcg/cpu-exec.c | 34 +++++++++++++++++-----------------
>  1 file changed, 17 insertions(+), 17 deletions(-)

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 06/10] accel/tcg: Use CF_NO_GOTO_{TB,PTR} in cpu_exec_step_atomic
  2021-07-12 15:40 ` [PATCH v2 06/10] accel/tcg: Use CF_NO_GOTO_{TB, PTR} in cpu_exec_step_atomic Richard Henderson
  2021-07-17 17:43   ` Peter Maydell
@ 2021-07-17 17:47   ` Alex Bennée
  1 sibling, 0 replies; 37+ messages in thread
From: Alex Bennée @ 2021-07-17 17:47 UTC (permalink / raw)
  To: Richard Henderson; +Cc: mark.cave-ayland, qemu-devel, f4bug


Richard Henderson <richard.henderson@linaro.org> writes:

> Request that the one TB returns immediately, so that
> we release the exclusive lock as soon as possible.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 09/10] accel/tcg: Hoist tb_cflags to a local in translator_loop
  2021-07-12 15:40 ` [PATCH v2 09/10] accel/tcg: Hoist tb_cflags to a local in translator_loop Richard Henderson
@ 2021-07-17 17:50   ` Alex Bennée
  0 siblings, 0 replies; 37+ messages in thread
From: Alex Bennée @ 2021-07-17 17:50 UTC (permalink / raw)
  To: Richard Henderson; +Cc: mark.cave-ayland, qemu-devel, f4bug


Richard Henderson <richard.henderson@linaro.org> writes:

> The access internal to tb_cflags() is atomic.
> Avoid re-reading it as such for the multiple uses.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 08/10] accel/tcg: Adjust interface of TranslatorOps.breakpoint_check
  2021-07-12 15:40 ` [PATCH v2 08/10] accel/tcg: Adjust interface of TranslatorOps.breakpoint_check Richard Henderson
@ 2021-07-17 17:52   ` Peter Maydell
  2021-07-17 19:41     ` Richard Henderson
  0 siblings, 1 reply; 37+ messages in thread
From: Peter Maydell @ 2021-07-17 17:52 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Alex Bennée, Mark Cave-Ayland, QEMU Developers,
	Philippe Mathieu-Daudé

On Mon, 12 Jul 2021 at 16:48, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> We don't need the whole CPUBreakpoint structure in the check,
> only the flags.  Return the instruction length to consolidate
> the adjustment of db->pc_next.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>


> diff --git a/target/avr/translate.c b/target/avr/translate.c
> index 8237a03c23..73ff467926 100644
> --- a/target/avr/translate.c
> +++ b/target/avr/translate.c
> @@ -2944,13 +2944,13 @@ static void avr_tr_insn_start(DisasContextBase *dcbase, CPUState *cs)
>      tcg_gen_insn_start(ctx->npc);
>  }
>
> -static bool avr_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
> -                                    const CPUBreakpoint *bp)
> +static int avr_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
> +                                   int bp_flags)
>  {
>      DisasContext *ctx = container_of(dcbase, DisasContext, base);
>
>      gen_breakpoint(ctx);
> -    return true;
> +    return 2; /* minimum instruction length */

Here we weren't advancing pc_next at all, and now we do.

> diff --git a/target/mips/tcg/translate.c b/target/mips/tcg/translate.c
> index 47c967acbf..c7b9d813c2 100644
> --- a/target/mips/tcg/translate.c
> +++ b/target/mips/tcg/translate.c
> @@ -16190,22 +16190,16 @@ static void mips_tr_insn_start(DisasContextBase *dcbase, CPUState *cs)
>                         ctx->btarget);
>  }
>
> -static bool mips_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
> -                                     const CPUBreakpoint *bp)
> +static int mips_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
> +                                    int bp_flags)
>  {
>      DisasContext *ctx = container_of(dcbase, DisasContext, base);
>
>      save_cpu_state(ctx, 1);
>      ctx->base.is_jmp = DISAS_NORETURN;
>      gen_helper_raise_exception_debug(cpu_env);
> -    /*
> -     * The address covered by the breakpoint must be included in
> -     * [tb->pc, tb->pc + tb->size) in order to for it to be
> -     * properly cleared -- thus we increment the PC here so that
> -     * the logic setting tb->size below does the right thing.
> -     */
> -    ctx->base.pc_next += 4;
> -    return true;
> +
> +    return 2; /* minimum instruction length */
>  }

Here we were advancing by 4 and now advance by 2.

> diff --git a/target/riscv/translate.c b/target/riscv/translate.c
> index deda0c8a44..8a6bc58572 100644
> --- a/target/riscv/translate.c
> +++ b/target/riscv/translate.c
> @@ -961,20 +961,15 @@ static void riscv_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
>      tcg_gen_insn_start(ctx->base.pc_next);
>  }
>
> -static bool riscv_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
> -                                      const CPUBreakpoint *bp)
> +static int riscv_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
> +                                     int bp_flags)
>  {
>      DisasContext *ctx = container_of(dcbase, DisasContext, base);
>
>      tcg_gen_movi_tl(cpu_pc, ctx->base.pc_next);
>      ctx->base.is_jmp = DISAS_NORETURN;
>      gen_exception_debug();
> -    /* The address covered by the breakpoint must be included in
> -       [tb->pc, tb->pc + tb->size) in order to for it to be
> -       properly cleared -- thus we increment the PC here so that
> -       the logic setting tb->size below does the right thing.  */
> -    ctx->base.pc_next += 4;
> -    return true;
> +    return 2; /* minimum instruction length */
>  }

Ditto.

If these are intentional changes (are they bugfixes?) they should be in a
separate patch.

Otherwise
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 10/10] accel/tcg: Encode breakpoint info into tb->cflags
  2021-07-12 15:40 ` [PATCH v2 10/10] accel/tcg: Encode breakpoint info into tb->cflags Richard Henderson
@ 2021-07-17 17:58   ` Peter Maydell
  2021-07-17 19:20     ` Richard Henderson
  0 siblings, 1 reply; 37+ messages in thread
From: Peter Maydell @ 2021-07-17 17:58 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Alex Bennée, Mark Cave-Ayland, QEMU Developers,
	Philippe Mathieu-Daudé

On Mon, 12 Jul 2021 at 16:49, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Having this data in cflags means that hashing takes care
> of selecting a TB with or without exceptions built in.
> Which means that we no longer need to flush all TBs.
>
> This does require that we single-step while we're within a page
> that contains a breakpoint, so it's not yet ideal, but should be
> an improvement over some corner-case slowdowns.
>
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/404
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  include/exec/exec-all.h   |  7 ++++
>  accel/tcg/cpu-exec.c      | 68 ++++++++++++++++++++++++++++++-
>  accel/tcg/translate-all.c |  4 --
>  accel/tcg/translator.c    | 85 +++++++++++++++++++++------------------
>  cpu.c                     | 24 -----------
>  5 files changed, 119 insertions(+), 69 deletions(-)
>
> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
> index 6873cce8df..7ab2578f71 100644
> --- a/include/exec/exec-all.h
> +++ b/include/exec/exec-all.h
> @@ -502,9 +502,16 @@ struct TranslationBlock {
>  #define CF_USE_ICOUNT    0x00020000
>  #define CF_INVALID       0x00040000 /* TB is stale. Set with @jmp_lock held */
>  #define CF_PARALLEL      0x00080000 /* Generate code for a parallel context */
> +#define CF_BP_MASK       0x00300000 /* See below */
> +#define CF_BP_SHIFT      20
>  #define CF_CLUSTER_MASK  0xff000000 /* Top 8 bits are cluster ID */
>  #define CF_CLUSTER_SHIFT 24
>
> +#define CF_BP_NONE       (0 << CF_BP_SHIFT) /* TB does not interact with BPs */
> +#define CF_BP_SSTEP      (1 << CF_BP_SHIFT) /* gdbstub single-step in effect */
> +#define CF_BP_GDB        (2 << CF_BP_SHIFT) /* gdbstub breakpoint at tb->pc */
> +#define CF_BP_CPU        (3 << CF_BP_SHIFT) /* arch breakpoint at tb->pc */
> +
>      /* Per-vCPU dynamic tracing state used to generate this TB */
>      uint32_t trace_vcpu_dstate;
>
> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
> index 4d043a11aa..179a425ece 100644
> --- a/accel/tcg/cpu-exec.c
> +++ b/accel/tcg/cpu-exec.c
> @@ -222,6 +222,65 @@ static inline void log_cpu_exec(target_ulong pc, CPUState *cpu,
>      }
>  }
>
> +static uint32_t cflags_for_breakpoints(CPUState *cpu, target_ulong pc,
> +                                       uint32_t cflags)
> +{
> +    uint32_t bflags = 0;
> +
> +    if (unlikely(cpu->singlestep_enabled)) {
> +        bflags = CF_BP_SSTEP;
> +    } else {

Won't this ignore breakpoints when singlestepping ?

-- PMM


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 01/10] accel/tcg: Reduce CF_COUNT_MASK to match TCG_MAX_INSNS
  2021-07-17 17:30   ` Peter Maydell
@ 2021-07-17 18:31     ` Richard Henderson
  0 siblings, 0 replies; 37+ messages in thread
From: Richard Henderson @ 2021-07-17 18:31 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Alex Bennée, Mark Cave-Ayland, QEMU Developers,
	Philippe Mathieu-Daudé

On 7/17/21 10:30 AM, Peter Maydell wrote:
> On Mon, 12 Jul 2021 at 16:42, Richard Henderson
> <richard.henderson@linaro.org> wrote:
>>
>> The space reserved for CF_COUNT_MASK was overly large.
>> Reduce to free up cflags bits and eliminate an extra test.
>>
>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>> ---
>>   include/exec/exec-all.h   | 4 +++-
>>   accel/tcg/translate-all.c | 5 ++---
>>   2 files changed, 5 insertions(+), 4 deletions(-)
>>
>> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
>> index 754f4130c9..dfe82ed19c 100644
>> --- a/include/exec/exec-all.h
>> +++ b/include/exec/exec-all.h
>> @@ -492,7 +492,9 @@ struct TranslationBlock {
>>       target_ulong cs_base; /* CS base for this block */
>>       uint32_t flags; /* flags defining in which context the code was generated */
>>       uint32_t cflags;    /* compile flags */
>> -#define CF_COUNT_MASK  0x00007fff
>> +
>> +/* Note that TCG_MAX_INSNS is 512; we validate this match elsewhere. */
>> +#define CF_COUNT_MASK  0x000001ff
>>   #define CF_LAST_IO     0x00008000 /* Last insn may be an IO access.  */
>>   #define CF_MEMI_ONLY   0x00010000 /* Only instrument memory ops */
>>   #define CF_USE_ICOUNT  0x00020000
>> diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
>> index 4df26de858..997e44c73b 100644
>> --- a/accel/tcg/translate-all.c
>> +++ b/accel/tcg/translate-all.c
>> @@ -1430,9 +1430,8 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
>>       if (max_insns == 0) {
>>           max_insns = CF_COUNT_MASK;
>>       }
>> -    if (max_insns > TCG_MAX_INSNS) {
>> -        max_insns = TCG_MAX_INSNS;
>> -    }
>> +    QEMU_BUILD_BUG_ON(CF_COUNT_MASK + 1 != TCG_MAX_INSNS);
>> +
> 
> Previously we would allow max_insns = TCG_MAX_INSNS (512).
> Now we only allow it to be 511...

Hmm.  Well, 0 is supposed to map to "max", currently written as

     max_insns = cflags & CF_COUNT_MASK;
     if (max_insns == 0) {
         max_insns = CF_COUNT_MASK;
     }

I could just as easily map 0 -> TCG_MAX_INSNS instead.


r~



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 04/10] accel/tcg: Drop CF_NO_GOTO_PTR from -d nochain
  2021-07-17 17:39   ` Alex Bennée
@ 2021-07-17 18:38     ` Richard Henderson
  2021-07-17 19:48       ` Alex Bennée
  0 siblings, 1 reply; 37+ messages in thread
From: Richard Henderson @ 2021-07-17 18:38 UTC (permalink / raw)
  To: Alex Bennée; +Cc: mark.cave-ayland, qemu-devel, f4bug

On 7/17/21 10:39 AM, Alex Bennée wrote:
> 
> Richard Henderson <richard.henderson@linaro.org> writes:
> 
>> The purpose of suppressing goto_ptr from -d nochain had been
>> to return to the main loop so that -d cpu would be recognized.
> 
> Hmm is it though? I've always treated it as ensuring we always come out
> into the main loop (which is helpful for debugging).

What's helpful for debugging wrt the main loop beyond logging?


r~

> 
>> But we now include -d cpu logging in helper_lookup_tb_ptr so
>> there is no need to exclude goto_ptr.
>>
>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>> ---
>>   accel/tcg/cpu-exec.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
>> index d3232d5764..70ea3c7d68 100644
>> --- a/accel/tcg/cpu-exec.c
>> +++ b/accel/tcg/cpu-exec.c
>> @@ -150,7 +150,7 @@ uint32_t curr_cflags(CPUState *cpu)
>>       uint32_t cflags = cpu->tcg_cflags;
>>   
>>       if (qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
>> -        cflags |= CF_NO_GOTO_TB | CF_NO_GOTO_PTR;
>> +        cflags |= CF_NO_GOTO_TB;
>>       }
>>   
>>       return cflags;
> 
> 



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 05/10] accel/tcg: Handle -singlestep in curr_cflags
  2021-07-17 17:42   ` Alex Bennée
@ 2021-07-17 18:42     ` Richard Henderson
  2021-07-17 19:51       ` Peter Maydell
  0 siblings, 1 reply; 37+ messages in thread
From: Richard Henderson @ 2021-07-17 18:42 UTC (permalink / raw)
  To: Alex Bennée; +Cc: mark.cave-ayland, qemu-devel, f4bug

On 7/17/21 10:42 AM, Alex Bennée wrote:
> 
> Richard Henderson <richard.henderson@linaro.org> writes:
> 
>> Exchange the test in translator_use_goto_tb for CF_NO_GOTO_TB,
>> and the test in tb_gen_code for setting CF_COUNT_MASK to 1.
>>
>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>> ---
>>   accel/tcg/cpu-exec.c      | 8 +++++++-
>>   accel/tcg/translate-all.c | 2 +-
>>   accel/tcg/translator.c    | 2 +-
>>   3 files changed, 9 insertions(+), 3 deletions(-)
>>
>> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
>> index 70ea3c7d68..2206c463f5 100644
>> --- a/accel/tcg/cpu-exec.c
>> +++ b/accel/tcg/cpu-exec.c
>> @@ -149,7 +149,13 @@ uint32_t curr_cflags(CPUState *cpu)
>>   {
>>       uint32_t cflags = cpu->tcg_cflags;
>>   
>> -    if (qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
>> +    /*
>> +     * For singlestep and -d nochain, suppress goto_tb so that
>> +     * we can log -d cpu,exec after every TB.
>> +     */
>> +    if (singlestep) {
> 
> Hmm we are testing a magic global here and looking at
> cpu->singlestep_enabled lower down. We have a transient singlestep which
> is turned on an off via cpu->singlestep_enabled and a global as a debug
> option. Can we rationalise it further?

Not sure what you're asking.

cpu->singlestep_enabled raises a debug exception after one insn, whereas singlestep merely 
exits the tb after one insn.

What is it that you want me to rationalize?


r~


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 06/10] accel/tcg: Use CF_NO_GOTO_{TB, PTR} in cpu_exec_step_atomic
  2021-07-17 17:43   ` Peter Maydell
@ 2021-07-17 19:03     ` Richard Henderson
  0 siblings, 0 replies; 37+ messages in thread
From: Richard Henderson @ 2021-07-17 19:03 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Alex Bennée, Mark Cave-Ayland, QEMU Developers,
	Philippe Mathieu-Daudé

On 7/17/21 10:43 AM, Peter Maydell wrote:
> On Mon, 12 Jul 2021 at 16:46, Richard Henderson
> <richard.henderson@linaro.org> wrote:
>>
>> Request that the one TB returns immediately, so that
>> we release the exclusive lock as soon as possible.
>>
>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>> ---
>>   accel/tcg/cpu-exec.c | 11 ++++++++---
>>   1 file changed, 8 insertions(+), 3 deletions(-)
>>
>> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
>> index 2206c463f5..5bb099174f 100644
>> --- a/accel/tcg/cpu-exec.c
>> +++ b/accel/tcg/cpu-exec.c
>> @@ -330,8 +330,7 @@ void cpu_exec_step_atomic(CPUState *cpu)
>>       CPUArchState *env = (CPUArchState *)cpu->env_ptr;
>>       TranslationBlock *tb;
>>       target_ulong cs_base, pc;
>> -    uint32_t flags;
>> -    uint32_t cflags = (curr_cflags(cpu) & ~CF_PARALLEL) | 1;
>> +    uint32_t flags, cflags;
>>       int tb_exit;
>>
>>       if (sigsetjmp(cpu->jmp_env, 0) == 0) {
>> @@ -341,8 +340,14 @@ void cpu_exec_step_atomic(CPUState *cpu)
>>           cpu->running = true;
>>
>>           cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
>> -        tb = tb_lookup(cpu, pc, cs_base, flags, cflags);
>>
>> +        cflags = curr_cflags(cpu);
>> +        /* Execute in a serial context. */
>> +        cflags &= ~CF_PARALLEL;
>> +        /* After 1 insn, return and release the exclusive lock. */
>> +        cflags |= CF_NO_GOTO_TB | CF_NO_GOTO_PTR | 1;
>> +
>> +        tb = tb_lookup(cpu, pc, cs_base, flags, cflags);
>>           if (tb == NULL) {
>>               mmap_lock();
>>               tb = tb_gen_code(cpu, pc, cs_base, flags, cflags);
> 
> So previously we would have executed possibly a chain of TBs
> before releasing the lock, and now we definitely execute just one?

Correct.

> (I guess the execute-a-chain case is unlikely given the TB
> only has one insn and we know it's an exclusive insn...)

I think it's actually likely.  While the tb would definitely end after one insn, we had 
passed nothing down that would lead to returning to the main loop.


r~

> 
> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
> 
> thanks
> -- PMM
> 



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 10/10] accel/tcg: Encode breakpoint info into tb->cflags
  2021-07-17 17:58   ` Peter Maydell
@ 2021-07-17 19:20     ` Richard Henderson
  0 siblings, 0 replies; 37+ messages in thread
From: Richard Henderson @ 2021-07-17 19:20 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Alex Bennée, Mark Cave-Ayland, QEMU Developers,
	Philippe Mathieu-Daudé

On 7/17/21 10:58 AM, Peter Maydell wrote:
>> +static uint32_t cflags_for_breakpoints(CPUState *cpu, target_ulong pc,
>> +                                       uint32_t cflags)
>> +{
>> +    uint32_t bflags = 0;
>> +
>> +    if (unlikely(cpu->singlestep_enabled)) {
>> +        bflags = CF_BP_SSTEP;
>> +    } else {
> 
> Won't this ignore breakpoints when singlestepping ?

Single-step already has priority over other breakpoints:

>          /* Pass breakpoint hits to target for further processing */
> -        if (!db->singlestep_enabled
> -            && unlikely(!QTAILQ_EMPTY(&cpu->breakpoints))) {


r~


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 08/10] accel/tcg: Adjust interface of TranslatorOps.breakpoint_check
  2021-07-17 17:52   ` Peter Maydell
@ 2021-07-17 19:41     ` Richard Henderson
  0 siblings, 0 replies; 37+ messages in thread
From: Richard Henderson @ 2021-07-17 19:41 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Alex Bennée, Mark Cave-Ayland, QEMU Developers,
	Philippe Mathieu-Daudé

On 7/17/21 10:52 AM, Peter Maydell wrote:
> On Mon, 12 Jul 2021 at 16:48, Richard Henderson
> <richard.henderson@linaro.org> wrote:
>>
>> We don't need the whole CPUBreakpoint structure in the check,
>> only the flags.  Return the instruction length to consolidate
>> the adjustment of db->pc_next.
>>
>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> 
> 
>> diff --git a/target/avr/translate.c b/target/avr/translate.c
>> index 8237a03c23..73ff467926 100644
>> --- a/target/avr/translate.c
>> +++ b/target/avr/translate.c
>> @@ -2944,13 +2944,13 @@ static void avr_tr_insn_start(DisasContextBase *dcbase, CPUState *cs)
>>       tcg_gen_insn_start(ctx->npc);
>>   }
>>
>> -static bool avr_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
>> -                                    const CPUBreakpoint *bp)
>> +static int avr_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
>> +                                   int bp_flags)
>>   {
>>       DisasContext *ctx = container_of(dcbase, DisasContext, base);
>>
>>       gen_breakpoint(ctx);
>> -    return true;
>> +    return 2; /* minimum instruction length */
> 
> Here we weren't advancing pc_next at all, and now we do.
> 
>> diff --git a/target/mips/tcg/translate.c b/target/mips/tcg/translate.c
>> index 47c967acbf..c7b9d813c2 100644
>> --- a/target/mips/tcg/translate.c
>> +++ b/target/mips/tcg/translate.c
>> @@ -16190,22 +16190,16 @@ static void mips_tr_insn_start(DisasContextBase *dcbase, CPUState *cs)
>>                          ctx->btarget);
>>   }
>>
>> -static bool mips_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
>> -                                     const CPUBreakpoint *bp)
>> +static int mips_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
>> +                                    int bp_flags)
>>   {
>>       DisasContext *ctx = container_of(dcbase, DisasContext, base);
>>
>>       save_cpu_state(ctx, 1);
>>       ctx->base.is_jmp = DISAS_NORETURN;
>>       gen_helper_raise_exception_debug(cpu_env);
>> -    /*
>> -     * The address covered by the breakpoint must be included in
>> -     * [tb->pc, tb->pc + tb->size) in order to for it to be
>> -     * properly cleared -- thus we increment the PC here so that
>> -     * the logic setting tb->size below does the right thing.
>> -     */
>> -    ctx->base.pc_next += 4;
>> -    return true;
>> +
>> +    return 2; /* minimum instruction length */
>>   }
> 
> Here we were advancing by 4 and now advance by 2.
> 
>> diff --git a/target/riscv/translate.c b/target/riscv/translate.c
>> index deda0c8a44..8a6bc58572 100644
>> --- a/target/riscv/translate.c
>> +++ b/target/riscv/translate.c
>> @@ -961,20 +961,15 @@ static void riscv_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
>>       tcg_gen_insn_start(ctx->base.pc_next);
>>   }
>>
>> -static bool riscv_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
>> -                                      const CPUBreakpoint *bp)
>> +static int riscv_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
>> +                                     int bp_flags)
>>   {
>>       DisasContext *ctx = container_of(dcbase, DisasContext, base);
>>
>>       tcg_gen_movi_tl(cpu_pc, ctx->base.pc_next);
>>       ctx->base.is_jmp = DISAS_NORETURN;
>>       gen_exception_debug();
>> -    /* The address covered by the breakpoint must be included in
>> -       [tb->pc, tb->pc + tb->size) in order to for it to be
>> -       properly cleared -- thus we increment the PC here so that
>> -       the logic setting tb->size below does the right thing.  */
>> -    ctx->base.pc_next += 4;
>> -    return true;
>> +    return 2; /* minimum instruction length */
>>   }
> 
> Ditto.
> 
> If these are intentional changes (are they bugfixes?) they should be in a
> separate patch.

Yes, they are bug fixes.

r~

> 
> Otherwise
> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
> 
> thanks
> -- PMM
> 



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 04/10] accel/tcg: Drop CF_NO_GOTO_PTR from -d nochain
  2021-07-17 18:38     ` Richard Henderson
@ 2021-07-17 19:48       ` Alex Bennée
  0 siblings, 0 replies; 37+ messages in thread
From: Alex Bennée @ 2021-07-17 19:48 UTC (permalink / raw)
  To: Richard Henderson; +Cc: mark.cave-ayland, qemu-devel, f4bug


Richard Henderson <richard.henderson@linaro.org> writes:

> On 7/17/21 10:39 AM, Alex Bennée wrote:
>> Richard Henderson <richard.henderson@linaro.org> writes:
>> 
>>> The purpose of suppressing goto_ptr from -d nochain had been
>>> to return to the main loop so that -d cpu would be recognized.
>> Hmm is it though? I've always treated it as ensuring we always come
>> out
>> into the main loop (which is helpful for debugging).
>
> What's helpful for debugging wrt the main loop beyond logging?

Usually if I rr a bug I reverse continue to the top of the loop for my
re-run. I guess we can put breakpoints elsewhere it's just another place
to remember.

>
>
> r~
>
>> 
>>> But we now include -d cpu logging in helper_lookup_tb_ptr so
>>> there is no need to exclude goto_ptr.
>>>
>>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>>> ---
>>>   accel/tcg/cpu-exec.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
>>> index d3232d5764..70ea3c7d68 100644
>>> --- a/accel/tcg/cpu-exec.c
>>> +++ b/accel/tcg/cpu-exec.c
>>> @@ -150,7 +150,7 @@ uint32_t curr_cflags(CPUState *cpu)
>>>       uint32_t cflags = cpu->tcg_cflags;
>>>         if (qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
>>> -        cflags |= CF_NO_GOTO_TB | CF_NO_GOTO_PTR;
>>> +        cflags |= CF_NO_GOTO_TB;
>>>       }
>>>         return cflags;
>> 


-- 
Alex Bennée


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH v2 05/10] accel/tcg: Handle -singlestep in curr_cflags
  2021-07-17 18:42     ` Richard Henderson
@ 2021-07-17 19:51       ` Peter Maydell
  0 siblings, 0 replies; 37+ messages in thread
From: Peter Maydell @ 2021-07-17 19:51 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Mark Cave-Ayland, Alex Bennée, QEMU Developers,
	Philippe Mathieu-Daudé

On Sat, 17 Jul 2021 at 19:43, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> On 7/17/21 10:42 AM, Alex Bennée wrote:
> > Hmm we are testing a magic global here and looking at
> > cpu->singlestep_enabled lower down. We have a transient singlestep which
> > is turned on an off via cpu->singlestep_enabled and a global as a debug
> > option. Can we rationalise it further?
>
> Not sure what you're asking.
>
> cpu->singlestep_enabled raises a debug exception after one insn, whereas singlestep merely
> exits the tb after one insn.

We really should rename 'singlestep' to 'one_insn_per_tb' or something,
because it's continually confusing...

-- PMM


^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2021-07-17 19:53 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-12 15:39 [PATCH v2 00/10] tcg: breakpoint reorg Richard Henderson
2021-07-12 15:39 ` [PATCH v2 01/10] accel/tcg: Reduce CF_COUNT_MASK to match TCG_MAX_INSNS Richard Henderson
2021-07-17 17:30   ` Peter Maydell
2021-07-17 18:31     ` Richard Henderson
2021-07-17 17:33   ` Alex Bennée
2021-07-12 15:39 ` [PATCH v2 02/10] accel/tcg: Move curr_cflags into cpu-exec.c Richard Henderson
2021-07-17 17:32   ` Peter Maydell
2021-07-17 17:34   ` Alex Bennée
2021-07-12 15:39 ` [PATCH v2 03/10] accel/tcg: Add CF_NO_GOTO_TB and CF_NO_GOTO_PTR Richard Henderson
2021-07-17 17:36   ` Alex Bennée
2021-07-17 17:36   ` Peter Maydell
2021-07-12 15:39 ` [PATCH v2 04/10] accel/tcg: Drop CF_NO_GOTO_PTR from -d nochain Richard Henderson
2021-07-17 17:39   ` Peter Maydell
2021-07-17 17:39   ` Alex Bennée
2021-07-17 18:38     ` Richard Henderson
2021-07-17 19:48       ` Alex Bennée
2021-07-12 15:39 ` [PATCH v2 05/10] accel/tcg: Handle -singlestep in curr_cflags Richard Henderson
2021-07-17 17:41   ` Peter Maydell
2021-07-17 17:42   ` Alex Bennée
2021-07-17 18:42     ` Richard Henderson
2021-07-17 19:51       ` Peter Maydell
2021-07-12 15:40 ` [PATCH v2 06/10] accel/tcg: Use CF_NO_GOTO_{TB, PTR} in cpu_exec_step_atomic Richard Henderson
2021-07-17 17:43   ` Peter Maydell
2021-07-17 19:03     ` Richard Henderson
2021-07-17 17:47   ` [PATCH v2 06/10] accel/tcg: Use CF_NO_GOTO_{TB,PTR} " Alex Bennée
2021-07-12 15:40 ` [PATCH v2 07/10] accel/tcg: Move cflags lookup into tb_find Richard Henderson
2021-07-17 17:45   ` Peter Maydell
2021-07-12 15:40 ` [PATCH v2 08/10] accel/tcg: Adjust interface of TranslatorOps.breakpoint_check Richard Henderson
2021-07-17 17:52   ` Peter Maydell
2021-07-17 19:41     ` Richard Henderson
2021-07-12 15:40 ` [PATCH v2 09/10] accel/tcg: Hoist tb_cflags to a local in translator_loop Richard Henderson
2021-07-17 17:50   ` Alex Bennée
2021-07-12 15:40 ` [PATCH v2 10/10] accel/tcg: Encode breakpoint info into tb->cflags Richard Henderson
2021-07-17 17:58   ` Peter Maydell
2021-07-17 19:20     ` Richard Henderson
2021-07-12 18:20 ` [PATCH v2 00/10] tcg: breakpoint reorg Mark Cave-Ayland
2021-07-17 17:16 ` Richard Henderson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.