qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 00/17] tcg: breakpoint reorg
@ 2021-07-01 15:25 Richard Henderson
  2021-07-01 15:25 ` [PATCH 01/17] target/i386: Use cpu_breakpoint_test in breakpoint_handler Richard Henderson
                   ` (17 more replies)
  0 siblings, 18 replies; 24+ messages in thread
From: Richard Henderson @ 2021-07-01 15:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, mark.cave-ayland, alex.bennee, f4bug

Based-on: <20210630183226.3290849-1-richard.henderson@linaro.org>
("[PATCH v2 00/28] accel/tcg: Introduce translator_use_goto_tb")

This is my attempt at fixing #404 ("windows xp boot takes much longer...").

I don't actually have windows xp available myself, so I don't know
if this has worked, really.  I can still boot windows 7, but from
the lack of tracepoint firings I guess it doesn't play any silly
games with breakpoints.

This scheme is not without its drawbacks.  In exchange for no bookkeeping
and invalidation whatsoever, other code on the same page as an active
breakpoint runs one insn per tb, doing indirect chaining through
helper_lookup_tb_ptr to see if we hit the breakpoint.

The minor testing that I did seemed fast enough though, with gdb
responding quickly.  So before I go off and try to complicate
things again with extra bookkeeping, I thought I'd get some feedback.


r~


Richard Henderson (17):
  target/i386: Use cpu_breakpoint_test in breakpoint_handler
  accel/tcg: Move helper_lookup_tb_ptr to cpu-exec.c
  accel/tcg: Move tb_lookup to cpu-exec.c
  accel/tcg: Split out log_cpu_exec
  accel/tcg: Log tb->cflags with -d exec
  tcg: Remove TCG_TARGET_HAS_goto_ptr
  accel/tcg: Reduce CF_COUNT_MASK to match TCG_MAX_INSNS
  accel/tcg: Move curr_cflags into cpu-exec.c
  accel/tcg: Add CF_NO_GOTO_TB and CF_NO_GOTO_PTR
  accel/tcg: Drop CF_NO_GOTO_PTR from -d nochain
  accel/tcg: Handle -singlestep in curr_cflags
  accel/tcg: Use CF_NO_GOTO_{TB,PTR} in cpu_exec_step_atomic
  accel/tcg: Move cflags lookup into tb_find
  accel/tcg: Adjust interface of TranslatorOps.breakpoint_check
  accel/tcg: Hoist tb_cflags to a local in translator_loop
  accel/tcg: Encode breakpoint info into tb->cflags
  cpu: Add breakpoint tracepoints

 accel/tcg/tb-lookup.h               |  49 ------
 include/exec/exec-all.h             |  30 ++--
 include/exec/translator.h           |  17 +-
 include/tcg/tcg-opc.h               |   3 +-
 tcg/aarch64/tcg-target.h            |   1 -
 tcg/arm/tcg-target.h                |   1 -
 tcg/i386/tcg-target.h               |   1 -
 tcg/mips/tcg-target.h               |   1 -
 tcg/ppc/tcg-target.h                |   1 -
 tcg/riscv/tcg-target.h              |   1 -
 tcg/s390/tcg-target.h               |   1 -
 tcg/sparc/tcg-target.h              |   1 -
 tcg/tci/tcg-target.h                |   1 -
 accel/tcg/cpu-exec.c                | 238 +++++++++++++++++++++++-----
 accel/tcg/tcg-runtime.c             |  22 ---
 accel/tcg/translate-all.c           |   7 +-
 accel/tcg/translator.c              |  79 ++++++---
 cpu.c                               |  35 +---
 target/alpha/translate.c            |  12 +-
 target/arm/translate-a64.c          |  14 +-
 target/arm/translate.c              |  20 +--
 target/avr/translate.c              |   6 +-
 target/cris/translate.c             |  14 +-
 target/hexagon/translate.c          |  13 +-
 target/hppa/translate.c             |   7 +-
 target/i386/tcg/sysemu/bpt_helper.c |  12 +-
 target/i386/tcg/translate.c         |  15 +-
 target/m68k/translate.c             |  14 +-
 target/microblaze/translate.c       |  14 +-
 target/mips/tcg/translate.c         |  14 +-
 target/nios2/translate.c            |  13 +-
 target/openrisc/translate.c         |  11 +-
 target/ppc/translate.c              |  13 +-
 target/riscv/translate.c            |  11 +-
 target/rx/translate.c               |   8 +-
 target/s390x/translate.c            |  12 +-
 target/sh4/translate.c              |  12 +-
 target/sparc/translate.c            |   9 +-
 target/tricore/translate.c          |  13 +-
 target/xtensa/translate.c           |  12 +-
 tcg/tcg-op.c                        |  28 ++--
 tcg/tcg.c                           |   8 +-
 trace-events                        |   5 +
 43 files changed, 386 insertions(+), 413 deletions(-)
 delete mode 100644 accel/tcg/tb-lookup.h

-- 
2.25.1



^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 01/17] target/i386: Use cpu_breakpoint_test in breakpoint_handler
  2021-07-01 15:25 [PATCH 00/17] tcg: breakpoint reorg Richard Henderson
@ 2021-07-01 15:25 ` Richard Henderson
  2021-07-01 15:25 ` [PATCH 02/17] accel/tcg: Move helper_lookup_tb_ptr to cpu-exec.c Richard Henderson
                   ` (16 subsequent siblings)
  17 siblings, 0 replies; 24+ messages in thread
From: Richard Henderson @ 2021-07-01 15:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, mark.cave-ayland, alex.bennee, f4bug

The loop is performing a simple boolean test for the existence
of a BP_CPU breakpoint at EIP.  Plus it gets the iteration wrong,
if we happen to have a BP_GDB breakpoint at the same address.

We have a function for this: cpu_breakpoint_test.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/i386/tcg/sysemu/bpt_helper.c | 12 +++---------
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/target/i386/tcg/sysemu/bpt_helper.c b/target/i386/tcg/sysemu/bpt_helper.c
index 9bdf7e170b..f1fb479ad9 100644
--- a/target/i386/tcg/sysemu/bpt_helper.c
+++ b/target/i386/tcg/sysemu/bpt_helper.c
@@ -210,7 +210,6 @@ void breakpoint_handler(CPUState *cs)
 {
     X86CPU *cpu = X86_CPU(cs);
     CPUX86State *env = &cpu->env;
-    CPUBreakpoint *bp;
 
     if (cs->watchpoint_hit) {
         if (cs->watchpoint_hit->flags & BP_CPU) {
@@ -222,14 +221,9 @@ void breakpoint_handler(CPUState *cs)
             }
         }
     } else {
-        QTAILQ_FOREACH(bp, &cs->breakpoints, entry) {
-            if (bp->pc == env->eip) {
-                if (bp->flags & BP_CPU) {
-                    check_hw_breakpoints(env, true);
-                    raise_exception(env, EXCP01_DB);
-                }
-                break;
-            }
+        if (cpu_breakpoint_test(cs, env->eip, BP_CPU)) {
+            check_hw_breakpoints(env, true);
+            raise_exception(env, EXCP01_DB);
         }
     }
 }
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 02/17] accel/tcg: Move helper_lookup_tb_ptr to cpu-exec.c
  2021-07-01 15:25 [PATCH 00/17] tcg: breakpoint reorg Richard Henderson
  2021-07-01 15:25 ` [PATCH 01/17] target/i386: Use cpu_breakpoint_test in breakpoint_handler Richard Henderson
@ 2021-07-01 15:25 ` Richard Henderson
  2021-07-01 16:33   ` Philippe Mathieu-Daudé
  2021-07-01 15:25 ` [PATCH 03/17] accel/tcg: Move tb_lookup " Richard Henderson
                   ` (15 subsequent siblings)
  17 siblings, 1 reply; 24+ messages in thread
From: Richard Henderson @ 2021-07-01 15:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, mark.cave-ayland, alex.bennee, f4bug

This will allow additional code sharing.
No functional change.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cpu-exec.c    | 30 ++++++++++++++++++++++++++++++
 accel/tcg/tcg-runtime.c | 22 ----------------------
 2 files changed, 30 insertions(+), 22 deletions(-)

diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index ad1279d2ed..fb6668606f 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -38,6 +38,7 @@
 #include "exec/cpu-all.h"
 #include "sysemu/cpu-timers.h"
 #include "sysemu/replay.h"
+#include "exec/helper-proto.h"
 #include "tb-hash.h"
 #include "tb-lookup.h"
 #include "tb-context.h"
@@ -145,6 +146,35 @@ static void init_delay_params(SyncClocks *sc, const CPUState *cpu)
 }
 #endif /* CONFIG USER ONLY */
 
+/**
+ * helper_lookup_tb_ptr: quick check for next tb
+ * @env: current cpu state
+ *
+ * Look for an existing TB matching the current cpu state.
+ * If found, return the code pointer.  If not found, return
+ * the tcg epilogue so that we return into cpu_tb_exec.
+ */
+const void *HELPER(lookup_tb_ptr)(CPUArchState *env)
+{
+    CPUState *cpu = env_cpu(env);
+    TranslationBlock *tb;
+    target_ulong cs_base, pc;
+    uint32_t flags;
+
+    cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
+
+    tb = tb_lookup(cpu, pc, cs_base, flags, curr_cflags(cpu));
+    if (tb == NULL) {
+        return tcg_code_gen_epilogue;
+    }
+    qemu_log_mask_and_addr(CPU_LOG_EXEC, pc,
+                           "Chain %d: %p ["
+                           TARGET_FMT_lx "/" TARGET_FMT_lx "/%#x] %s\n",
+                           cpu->cpu_index, tb->tc.ptr, cs_base, pc, flags,
+                           lookup_symbol(pc));
+    return tb->tc.ptr;
+}
+
 /* Execute a TB, and fix up the CPU state afterwards if necessary */
 /*
  * Disable CFI checks.
diff --git a/accel/tcg/tcg-runtime.c b/accel/tcg/tcg-runtime.c
index 66ac830e2f..e4e030043f 100644
--- a/accel/tcg/tcg-runtime.c
+++ b/accel/tcg/tcg-runtime.c
@@ -30,7 +30,6 @@
 #include "disas/disas.h"
 #include "exec/log.h"
 #include "tcg/tcg.h"
-#include "tb-lookup.h"
 
 /* 32-bit helpers */
 
@@ -145,27 +144,6 @@ uint64_t HELPER(ctpop_i64)(uint64_t arg)
     return ctpop64(arg);
 }
 
-const void *HELPER(lookup_tb_ptr)(CPUArchState *env)
-{
-    CPUState *cpu = env_cpu(env);
-    TranslationBlock *tb;
-    target_ulong cs_base, pc;
-    uint32_t flags;
-
-    cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
-
-    tb = tb_lookup(cpu, pc, cs_base, flags, curr_cflags(cpu));
-    if (tb == NULL) {
-        return tcg_code_gen_epilogue;
-    }
-    qemu_log_mask_and_addr(CPU_LOG_EXEC, pc,
-                           "Chain %d: %p ["
-                           TARGET_FMT_lx "/" TARGET_FMT_lx "/%#x] %s\n",
-                           cpu->cpu_index, tb->tc.ptr, cs_base, pc, flags,
-                           lookup_symbol(pc));
-    return tb->tc.ptr;
-}
-
 void HELPER(exit_atomic)(CPUArchState *env)
 {
     cpu_loop_exit_atomic(env_cpu(env), GETPC());
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 03/17] accel/tcg: Move tb_lookup to cpu-exec.c
  2021-07-01 15:25 [PATCH 00/17] tcg: breakpoint reorg Richard Henderson
  2021-07-01 15:25 ` [PATCH 01/17] target/i386: Use cpu_breakpoint_test in breakpoint_handler Richard Henderson
  2021-07-01 15:25 ` [PATCH 02/17] accel/tcg: Move helper_lookup_tb_ptr to cpu-exec.c Richard Henderson
@ 2021-07-01 15:25 ` Richard Henderson
  2021-07-01 16:34   ` Philippe Mathieu-Daudé
  2021-07-01 15:25 ` [PATCH 04/17] accel/tcg: Split out log_cpu_exec Richard Henderson
                   ` (14 subsequent siblings)
  17 siblings, 1 reply; 24+ messages in thread
From: Richard Henderson @ 2021-07-01 15:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, mark.cave-ayland, alex.bennee, f4bug

Now that we've moved helper_lookup_tb_ptr, the only user
of tb-lookup.h is cpu-exec.c; merge the contents in.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/tb-lookup.h | 49 -------------------------------------------
 accel/tcg/cpu-exec.c  | 31 ++++++++++++++++++++++++++-
 2 files changed, 30 insertions(+), 50 deletions(-)
 delete mode 100644 accel/tcg/tb-lookup.h

diff --git a/accel/tcg/tb-lookup.h b/accel/tcg/tb-lookup.h
deleted file mode 100644
index 9c9e0079da..0000000000
--- a/accel/tcg/tb-lookup.h
+++ /dev/null
@@ -1,49 +0,0 @@
-/*
- * Copyright (C) 2017, Emilio G. Cota <cota@braap.org>
- *
- * License: GNU GPL, version 2 or later.
- *   See the COPYING file in the top-level directory.
- */
-#ifndef EXEC_TB_LOOKUP_H
-#define EXEC_TB_LOOKUP_H
-
-#ifdef NEED_CPU_H
-#include "cpu.h"
-#else
-#include "exec/poison.h"
-#endif
-
-#include "exec/exec-all.h"
-#include "tb-hash.h"
-
-/* Might cause an exception, so have a longjmp destination ready */
-static inline TranslationBlock *tb_lookup(CPUState *cpu, target_ulong pc,
-                                          target_ulong cs_base,
-                                          uint32_t flags, uint32_t cflags)
-{
-    TranslationBlock *tb;
-    uint32_t hash;
-
-    /* we should never be trying to look up an INVALID tb */
-    tcg_debug_assert(!(cflags & CF_INVALID));
-
-    hash = tb_jmp_cache_hash_func(pc);
-    tb = qatomic_rcu_read(&cpu->tb_jmp_cache[hash]);
-
-    if (likely(tb &&
-               tb->pc == pc &&
-               tb->cs_base == cs_base &&
-               tb->flags == flags &&
-               tb->trace_vcpu_dstate == *cpu->trace_dstate &&
-               tb_cflags(tb) == cflags)) {
-        return tb;
-    }
-    tb = tb_htable_lookup(cpu, pc, cs_base, flags, cflags);
-    if (tb == NULL) {
-        return NULL;
-    }
-    qatomic_set(&cpu->tb_jmp_cache[hash], tb);
-    return tb;
-}
-
-#endif /* EXEC_TB_LOOKUP_H */
diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index fb6668606f..0d92698030 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -40,7 +40,6 @@
 #include "sysemu/replay.h"
 #include "exec/helper-proto.h"
 #include "tb-hash.h"
-#include "tb-lookup.h"
 #include "tb-context.h"
 #include "internal.h"
 
@@ -146,6 +145,36 @@ static void init_delay_params(SyncClocks *sc, const CPUState *cpu)
 }
 #endif /* CONFIG USER ONLY */
 
+/* Might cause an exception, so have a longjmp destination ready */
+static inline TranslationBlock *tb_lookup(CPUState *cpu, target_ulong pc,
+                                          target_ulong cs_base,
+                                          uint32_t flags, uint32_t cflags)
+{
+    TranslationBlock *tb;
+    uint32_t hash;
+
+    /* we should never be trying to look up an INVALID tb */
+    tcg_debug_assert(!(cflags & CF_INVALID));
+
+    hash = tb_jmp_cache_hash_func(pc);
+    tb = qatomic_rcu_read(&cpu->tb_jmp_cache[hash]);
+
+    if (likely(tb &&
+               tb->pc == pc &&
+               tb->cs_base == cs_base &&
+               tb->flags == flags &&
+               tb->trace_vcpu_dstate == *cpu->trace_dstate &&
+               tb_cflags(tb) == cflags)) {
+        return tb;
+    }
+    tb = tb_htable_lookup(cpu, pc, cs_base, flags, cflags);
+    if (tb == NULL) {
+        return NULL;
+    }
+    qatomic_set(&cpu->tb_jmp_cache[hash], tb);
+    return tb;
+}
+
 /**
  * helper_lookup_tb_ptr: quick check for next tb
  * @env: current cpu state
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 04/17] accel/tcg: Split out log_cpu_exec
  2021-07-01 15:25 [PATCH 00/17] tcg: breakpoint reorg Richard Henderson
                   ` (2 preceding siblings ...)
  2021-07-01 15:25 ` [PATCH 03/17] accel/tcg: Move tb_lookup " Richard Henderson
@ 2021-07-01 15:25 ` Richard Henderson
  2021-07-01 15:25 ` [PATCH 05/17] accel/tcg: Log tb->cflags with -d exec Richard Henderson
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 24+ messages in thread
From: Richard Henderson @ 2021-07-01 15:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, mark.cave-ayland, alex.bennee, f4bug

Split out CPU_LOG_EXEC and CPU_LOG_TB_CPU logging from
cpu_tb_exec to a new function.  Perform only one pc
range check after a combined mask check.

Use the new function in lookup_tb_ptr.  This enables
CPU_LOG_TB_CPU between indirectly chained tbs.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cpu-exec.c | 61 ++++++++++++++++++++++++--------------------
 1 file changed, 34 insertions(+), 27 deletions(-)

diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index 0d92698030..67ed25beb9 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -175,6 +175,36 @@ static inline TranslationBlock *tb_lookup(CPUState *cpu, target_ulong pc,
     return tb;
 }
 
+static inline void log_cpu_exec(target_ulong pc, CPUState *cpu,
+                                const TranslationBlock *tb)
+{
+    if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_CPU | CPU_LOG_EXEC))
+        && qemu_log_in_addr_range(pc)) {
+
+        qemu_log_mask(CPU_LOG_EXEC,
+                      "Trace %d: %p [" TARGET_FMT_lx
+                      "/" TARGET_FMT_lx "/%#x] %s\n",
+                      cpu->cpu_index, tb->tc.ptr, tb->cs_base, pc, tb->flags,
+                      lookup_symbol(pc));
+
+#if defined(DEBUG_DISAS)
+        if (qemu_loglevel_mask(CPU_LOG_TB_CPU)) {
+            FILE *logfile = qemu_log_lock();
+            int flags = 0;
+
+            if (qemu_loglevel_mask(CPU_LOG_TB_FPU)) {
+                flags |= CPU_DUMP_FPU;
+            }
+#if defined(TARGET_I386)
+            flags |= CPU_DUMP_CCOP;
+#endif
+            log_cpu_state(cpu, flags);
+            qemu_log_unlock(logfile);
+        }
+#endif /* DEBUG_DISAS */
+    }
+}
+
 /**
  * helper_lookup_tb_ptr: quick check for next tb
  * @env: current cpu state
@@ -196,11 +226,9 @@ const void *HELPER(lookup_tb_ptr)(CPUArchState *env)
     if (tb == NULL) {
         return tcg_code_gen_epilogue;
     }
-    qemu_log_mask_and_addr(CPU_LOG_EXEC, pc,
-                           "Chain %d: %p ["
-                           TARGET_FMT_lx "/" TARGET_FMT_lx "/%#x] %s\n",
-                           cpu->cpu_index, tb->tc.ptr, cs_base, pc, flags,
-                           lookup_symbol(pc));
+
+    log_cpu_exec(pc, cpu, tb);
+
     return tb->tc.ptr;
 }
 
@@ -222,28 +250,7 @@ cpu_tb_exec(CPUState *cpu, TranslationBlock *itb, int *tb_exit)
     TranslationBlock *last_tb;
     const void *tb_ptr = itb->tc.ptr;
 
-    qemu_log_mask_and_addr(CPU_LOG_EXEC, itb->pc,
-                           "Trace %d: %p ["
-                           TARGET_FMT_lx "/" TARGET_FMT_lx "/%#x] %s\n",
-                           cpu->cpu_index, itb->tc.ptr,
-                           itb->cs_base, itb->pc, itb->flags,
-                           lookup_symbol(itb->pc));
-
-#if defined(DEBUG_DISAS)
-    if (qemu_loglevel_mask(CPU_LOG_TB_CPU)
-        && qemu_log_in_addr_range(itb->pc)) {
-        FILE *logfile = qemu_log_lock();
-        int flags = 0;
-        if (qemu_loglevel_mask(CPU_LOG_TB_FPU)) {
-            flags |= CPU_DUMP_FPU;
-        }
-#if defined(TARGET_I386)
-        flags |= CPU_DUMP_CCOP;
-#endif
-        log_cpu_state(cpu, flags);
-        qemu_log_unlock(logfile);
-    }
-#endif /* DEBUG_DISAS */
+    log_cpu_exec(itb->pc, cpu, itb);
 
     qemu_thread_jit_execute();
     ret = tcg_qemu_tb_exec(env, tb_ptr);
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 05/17] accel/tcg: Log tb->cflags with -d exec
  2021-07-01 15:25 [PATCH 00/17] tcg: breakpoint reorg Richard Henderson
                   ` (3 preceding siblings ...)
  2021-07-01 15:25 ` [PATCH 04/17] accel/tcg: Split out log_cpu_exec Richard Henderson
@ 2021-07-01 15:25 ` Richard Henderson
  2021-07-01 16:36   ` Philippe Mathieu-Daudé
  2021-07-01 15:25 ` [PATCH 06/17] tcg: Remove TCG_TARGET_HAS_goto_ptr Richard Henderson
                   ` (12 subsequent siblings)
  17 siblings, 1 reply; 24+ messages in thread
From: Richard Henderson @ 2021-07-01 15:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, mark.cave-ayland, alex.bennee, f4bug

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cpu-exec.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index 67ed25beb9..e22bcb99f7 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -183,9 +183,9 @@ static inline void log_cpu_exec(target_ulong pc, CPUState *cpu,
 
         qemu_log_mask(CPU_LOG_EXEC,
                       "Trace %d: %p [" TARGET_FMT_lx
-                      "/" TARGET_FMT_lx "/%#x] %s\n",
-                      cpu->cpu_index, tb->tc.ptr, tb->cs_base, pc, tb->flags,
-                      lookup_symbol(pc));
+                      "/" TARGET_FMT_lx "/%08x/%08x] %s\n",
+                      cpu->cpu_index, tb->tc.ptr, tb->cs_base, pc,
+                      tb->flags, tb->cflags, lookup_symbol(pc));
 
 #if defined(DEBUG_DISAS)
         if (qemu_loglevel_mask(CPU_LOG_TB_CPU)) {
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 06/17] tcg: Remove TCG_TARGET_HAS_goto_ptr
  2021-07-01 15:25 [PATCH 00/17] tcg: breakpoint reorg Richard Henderson
                   ` (4 preceding siblings ...)
  2021-07-01 15:25 ` [PATCH 05/17] accel/tcg: Log tb->cflags with -d exec Richard Henderson
@ 2021-07-01 15:25 ` Richard Henderson
  2021-07-01 16:37   ` Philippe Mathieu-Daudé
  2021-07-01 15:25 ` [PATCH 07/17] accel/tcg: Reduce CF_COUNT_MASK to match TCG_MAX_INSNS Richard Henderson
                   ` (11 subsequent siblings)
  17 siblings, 1 reply; 24+ messages in thread
From: Richard Henderson @ 2021-07-01 15:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, mark.cave-ayland, alex.bennee, f4bug

Since 6eea04347eb6, all tcg backends support goto_ptr.
Remove the conditional, making support mandatory.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/tcg/tcg-opc.h    | 3 +--
 tcg/aarch64/tcg-target.h | 1 -
 tcg/arm/tcg-target.h     | 1 -
 tcg/i386/tcg-target.h    | 1 -
 tcg/mips/tcg-target.h    | 1 -
 tcg/ppc/tcg-target.h     | 1 -
 tcg/riscv/tcg-target.h   | 1 -
 tcg/s390/tcg-target.h    | 1 -
 tcg/sparc/tcg-target.h   | 1 -
 tcg/tci/tcg-target.h     | 1 -
 tcg/tcg-op.c             | 2 +-
 tcg/tcg.c                | 8 ++------
 12 files changed, 4 insertions(+), 18 deletions(-)

diff --git a/include/tcg/tcg-opc.h b/include/tcg/tcg-opc.h
index 993992373e..675873e200 100644
--- a/include/tcg/tcg-opc.h
+++ b/include/tcg/tcg-opc.h
@@ -194,8 +194,7 @@ DEF(insn_start, 0, 0, TLADDR_ARGS * TARGET_INSN_START_WORDS,
     TCG_OPF_NOT_PRESENT)
 DEF(exit_tb, 0, 0, 1, TCG_OPF_BB_EXIT | TCG_OPF_BB_END)
 DEF(goto_tb, 0, 0, 1, TCG_OPF_BB_EXIT | TCG_OPF_BB_END)
-DEF(goto_ptr, 0, 1, 0,
-    TCG_OPF_BB_EXIT | TCG_OPF_BB_END | IMPL(TCG_TARGET_HAS_goto_ptr))
+DEF(goto_ptr, 0, 1, 0, TCG_OPF_BB_EXIT | TCG_OPF_BB_END)
 
 DEF(plugin_cb_start, 0, 0, 3, TCG_OPF_NOT_PRESENT)
 DEF(plugin_cb_end, 0, 0, 0, TCG_OPF_NOT_PRESENT)
diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h
index 551baf8da3..7a93ac8023 100644
--- a/tcg/aarch64/tcg-target.h
+++ b/tcg/aarch64/tcg-target.h
@@ -88,7 +88,6 @@ typedef enum {
 #define TCG_TARGET_HAS_mulsh_i32        0
 #define TCG_TARGET_HAS_extrl_i64_i32    0
 #define TCG_TARGET_HAS_extrh_i64_i32    0
-#define TCG_TARGET_HAS_goto_ptr         1
 #define TCG_TARGET_HAS_qemu_st8_i32     0
 
 #define TCG_TARGET_HAS_div_i64          1
diff --git a/tcg/arm/tcg-target.h b/tcg/arm/tcg-target.h
index 95fcef33bc..d113b7f8db 100644
--- a/tcg/arm/tcg-target.h
+++ b/tcg/arm/tcg-target.h
@@ -148,7 +148,6 @@ extern bool use_neon_instructions;
 #define TCG_TARGET_HAS_mulsh_i32        0
 #define TCG_TARGET_HAS_div_i32          use_idiv_instructions
 #define TCG_TARGET_HAS_rem_i32          0
-#define TCG_TARGET_HAS_goto_ptr         1
 #define TCG_TARGET_HAS_direct_jump      0
 #define TCG_TARGET_HAS_qemu_st8_i32     0
 
diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
index ac10066c3e..b00a6da293 100644
--- a/tcg/i386/tcg-target.h
+++ b/tcg/i386/tcg-target.h
@@ -135,7 +135,6 @@ extern bool have_movbe;
 #define TCG_TARGET_HAS_muls2_i32        1
 #define TCG_TARGET_HAS_muluh_i32        0
 #define TCG_TARGET_HAS_mulsh_i32        0
-#define TCG_TARGET_HAS_goto_ptr         1
 #define TCG_TARGET_HAS_direct_jump      1
 
 #if TCG_TARGET_REG_BITS == 64
diff --git a/tcg/mips/tcg-target.h b/tcg/mips/tcg-target.h
index e81e824cab..3a62055f04 100644
--- a/tcg/mips/tcg-target.h
+++ b/tcg/mips/tcg-target.h
@@ -136,7 +136,6 @@ extern bool use_mips32r2_instructions;
 #define TCG_TARGET_HAS_muluh_i32        1
 #define TCG_TARGET_HAS_mulsh_i32        1
 #define TCG_TARGET_HAS_bswap32_i32      1
-#define TCG_TARGET_HAS_goto_ptr         1
 #define TCG_TARGET_HAS_direct_jump      1
 
 #if TCG_TARGET_REG_BITS == 64
diff --git a/tcg/ppc/tcg-target.h b/tcg/ppc/tcg-target.h
index c13ed5640a..0943192cde 100644
--- a/tcg/ppc/tcg-target.h
+++ b/tcg/ppc/tcg-target.h
@@ -108,7 +108,6 @@ extern bool have_vsx;
 #define TCG_TARGET_HAS_muls2_i32        0
 #define TCG_TARGET_HAS_muluh_i32        1
 #define TCG_TARGET_HAS_mulsh_i32        1
-#define TCG_TARGET_HAS_goto_ptr         1
 #define TCG_TARGET_HAS_direct_jump      1
 #define TCG_TARGET_HAS_qemu_st8_i32     0
 
diff --git a/tcg/riscv/tcg-target.h b/tcg/riscv/tcg-target.h
index 87ea94666b..ef78b99e98 100644
--- a/tcg/riscv/tcg-target.h
+++ b/tcg/riscv/tcg-target.h
@@ -85,7 +85,6 @@ typedef enum {
 #define TCG_TARGET_CALL_STACK_OFFSET    0
 
 /* optional instructions */
-#define TCG_TARGET_HAS_goto_ptr         1
 #define TCG_TARGET_HAS_movcond_i32      0
 #define TCG_TARGET_HAS_div_i32          1
 #define TCG_TARGET_HAS_rem_i32          1
diff --git a/tcg/s390/tcg-target.h b/tcg/s390/tcg-target.h
index b04b72b7eb..2e4ede2ea2 100644
--- a/tcg/s390/tcg-target.h
+++ b/tcg/s390/tcg-target.h
@@ -98,7 +98,6 @@ extern uint64_t s390_facilities;
 #define TCG_TARGET_HAS_mulsh_i32      0
 #define TCG_TARGET_HAS_extrl_i64_i32  0
 #define TCG_TARGET_HAS_extrh_i64_i32  0
-#define TCG_TARGET_HAS_goto_ptr       1
 #define TCG_TARGET_HAS_direct_jump    (s390_facilities & FACILITY_GEN_INST_EXT)
 #define TCG_TARGET_HAS_qemu_st8_i32   0
 
diff --git a/tcg/sparc/tcg-target.h b/tcg/sparc/tcg-target.h
index 86bb9a2d39..c050763049 100644
--- a/tcg/sparc/tcg-target.h
+++ b/tcg/sparc/tcg-target.h
@@ -121,7 +121,6 @@ extern bool use_vis3_instructions;
 #define TCG_TARGET_HAS_muls2_i32        1
 #define TCG_TARGET_HAS_muluh_i32        0
 #define TCG_TARGET_HAS_mulsh_i32        0
-#define TCG_TARGET_HAS_goto_ptr         1
 #define TCG_TARGET_HAS_direct_jump      1
 #define TCG_TARGET_HAS_qemu_st8_i32     0
 
diff --git a/tcg/tci/tcg-target.h b/tcg/tci/tcg-target.h
index 7b6089f304..033e613f24 100644
--- a/tcg/tci/tcg-target.h
+++ b/tcg/tci/tcg-target.h
@@ -87,7 +87,6 @@
 #define TCG_TARGET_HAS_muls2_i32        1
 #define TCG_TARGET_HAS_muluh_i32        0
 #define TCG_TARGET_HAS_mulsh_i32        0
-#define TCG_TARGET_HAS_goto_ptr         1
 #define TCG_TARGET_HAS_direct_jump      0
 #define TCG_TARGET_HAS_qemu_st8_i32     0
 
diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index 44d711c0fc..3d5db9a33c 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -2754,7 +2754,7 @@ void tcg_gen_goto_tb(unsigned idx)
 
 void tcg_gen_lookup_and_goto_ptr(void)
 {
-    if (TCG_TARGET_HAS_goto_ptr && !qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
+    if (!qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
         TCGv_ptr ptr;
 
         plugin_gen_disable_mem_helpers();
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 5150ed700e..cba598c9d3 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -791,9 +791,7 @@ void tcg_prologue_init(TCGContext *s)
      * For tci, we use NULL as the signal to return from the interpreter,
      * so skip this check.
      */
-    if (TCG_TARGET_HAS_goto_ptr) {
-        tcg_debug_assert(tcg_code_gen_epilogue != NULL);
-    }
+    tcg_debug_assert(tcg_code_gen_epilogue != NULL);
 #endif
 }
 
@@ -1176,6 +1174,7 @@ bool tcg_op_supported(TCGOpcode op)
     case INDEX_op_insn_start:
     case INDEX_op_exit_tb:
     case INDEX_op_goto_tb:
+    case INDEX_op_goto_ptr:
     case INDEX_op_qemu_ld_i32:
     case INDEX_op_qemu_st_i32:
     case INDEX_op_qemu_ld_i64:
@@ -1185,9 +1184,6 @@ bool tcg_op_supported(TCGOpcode op)
     case INDEX_op_qemu_st8_i32:
         return TCG_TARGET_HAS_qemu_st8_i32;
 
-    case INDEX_op_goto_ptr:
-        return TCG_TARGET_HAS_goto_ptr;
-
     case INDEX_op_mov_i32:
     case INDEX_op_setcond_i32:
     case INDEX_op_brcond_i32:
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 07/17] accel/tcg: Reduce CF_COUNT_MASK to match TCG_MAX_INSNS
  2021-07-01 15:25 [PATCH 00/17] tcg: breakpoint reorg Richard Henderson
                   ` (5 preceding siblings ...)
  2021-07-01 15:25 ` [PATCH 06/17] tcg: Remove TCG_TARGET_HAS_goto_ptr Richard Henderson
@ 2021-07-01 15:25 ` Richard Henderson
  2021-07-01 15:25 ` [PATCH 08/17] accel/tcg: Move curr_cflags into cpu-exec.c Richard Henderson
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 24+ messages in thread
From: Richard Henderson @ 2021-07-01 15:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, mark.cave-ayland, alex.bennee, f4bug

The space reserved for CF_COUNT_MASK was overly large.
Reduce to free up cflags bits and eliminate an extra test.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/exec/exec-all.h   | 4 +++-
 accel/tcg/translate-all.c | 5 ++---
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 754f4130c9..dfe82ed19c 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -492,7 +492,9 @@ struct TranslationBlock {
     target_ulong cs_base; /* CS base for this block */
     uint32_t flags; /* flags defining in which context the code was generated */
     uint32_t cflags;    /* compile flags */
-#define CF_COUNT_MASK  0x00007fff
+
+/* Note that TCG_MAX_INSNS is 512; we validate this match elsewhere. */
+#define CF_COUNT_MASK  0x000001ff
 #define CF_LAST_IO     0x00008000 /* Last insn may be an IO access.  */
 #define CF_MEMI_ONLY   0x00010000 /* Only instrument memory ops */
 #define CF_USE_ICOUNT  0x00020000
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 7929a7e320..d0177d772d 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -1435,9 +1435,8 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
     if (max_insns == 0) {
         max_insns = CF_COUNT_MASK;
     }
-    if (max_insns > TCG_MAX_INSNS) {
-        max_insns = TCG_MAX_INSNS;
-    }
+    QEMU_BUILD_BUG_ON(CF_COUNT_MASK + 1 != TCG_MAX_INSNS);
+
     if (cpu->singlestep_enabled || singlestep) {
         max_insns = 1;
     }
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 08/17] accel/tcg: Move curr_cflags into cpu-exec.c
  2021-07-01 15:25 [PATCH 00/17] tcg: breakpoint reorg Richard Henderson
                   ` (6 preceding siblings ...)
  2021-07-01 15:25 ` [PATCH 07/17] accel/tcg: Reduce CF_COUNT_MASK to match TCG_MAX_INSNS Richard Henderson
@ 2021-07-01 15:25 ` Richard Henderson
  2021-07-01 15:25 ` [PATCH 09/17] accel/tcg: Add CF_NO_GOTO_TB and CF_NO_GOTO_PTR Richard Henderson
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 24+ messages in thread
From: Richard Henderson @ 2021-07-01 15:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, mark.cave-ayland, alex.bennee, f4bug

We will shortly have more than a simple member read here,
with stuff not necessarily exposed to exec/exec-all.h.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/exec/exec-all.h | 5 +----
 accel/tcg/cpu-exec.c    | 5 +++++
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index dfe82ed19c..ae7603ca75 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -565,10 +565,7 @@ static inline uint32_t tb_cflags(const TranslationBlock *tb)
 }
 
 /* current cflags for hashing/comparison */
-static inline uint32_t curr_cflags(CPUState *cpu)
-{
-    return cpu->tcg_cflags;
-}
+uint32_t curr_cflags(CPUState *cpu);
 
 /* TranslationBlock invalidate API */
 #if defined(CONFIG_USER_ONLY)
diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index e22bcb99f7..ef4214d893 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -145,6 +145,11 @@ static void init_delay_params(SyncClocks *sc, const CPUState *cpu)
 }
 #endif /* CONFIG USER ONLY */
 
+uint32_t curr_cflags(CPUState *cpu)
+{
+    return cpu->tcg_cflags;
+}
+
 /* Might cause an exception, so have a longjmp destination ready */
 static inline TranslationBlock *tb_lookup(CPUState *cpu, target_ulong pc,
                                           target_ulong cs_base,
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 09/17] accel/tcg: Add CF_NO_GOTO_TB and CF_NO_GOTO_PTR
  2021-07-01 15:25 [PATCH 00/17] tcg: breakpoint reorg Richard Henderson
                   ` (7 preceding siblings ...)
  2021-07-01 15:25 ` [PATCH 08/17] accel/tcg: Move curr_cflags into cpu-exec.c Richard Henderson
@ 2021-07-01 15:25 ` Richard Henderson
  2021-07-01 15:25 ` [PATCH 10/17] accel/tcg: Drop CF_NO_GOTO_PTR from -d nochain Richard Henderson
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 24+ messages in thread
From: Richard Henderson @ 2021-07-01 15:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, mark.cave-ayland, alex.bennee, f4bug

Move the -d nochain check to bits on tb->cflags.
These will be used for more than -d nochain shortly.

Set bits during curr_cflags, test them in translator_use_goto_tb,
assert we're not doing anything odd in tcg_gen_goto_tb.  The test
in tcg_gen_exit_tb is redundant with the assert for goto_tb_issue_mask.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/exec/exec-all.h | 16 +++++++++-------
 accel/tcg/cpu-exec.c    |  8 +++++++-
 accel/tcg/translator.c  |  5 +++++
 tcg/tcg-op.c            | 28 ++++++++++++----------------
 4 files changed, 33 insertions(+), 24 deletions(-)

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index ae7603ca75..6873cce8df 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -494,13 +494,15 @@ struct TranslationBlock {
     uint32_t cflags;    /* compile flags */
 
 /* Note that TCG_MAX_INSNS is 512; we validate this match elsewhere. */
-#define CF_COUNT_MASK  0x000001ff
-#define CF_LAST_IO     0x00008000 /* Last insn may be an IO access.  */
-#define CF_MEMI_ONLY   0x00010000 /* Only instrument memory ops */
-#define CF_USE_ICOUNT  0x00020000
-#define CF_INVALID     0x00040000 /* TB is stale. Set with @jmp_lock held */
-#define CF_PARALLEL    0x00080000 /* Generate code for a parallel context */
-#define CF_CLUSTER_MASK 0xff000000 /* Top 8 bits are cluster ID */
+#define CF_COUNT_MASK    0x000001ff
+#define CF_NO_GOTO_TB    0x00000200 /* Do not chain with goto_tb */
+#define CF_NO_GOTO_PTR   0x00000400 /* Do not chain with goto_ptr */
+#define CF_LAST_IO       0x00008000 /* Last insn may be an IO access.  */
+#define CF_MEMI_ONLY     0x00010000 /* Only instrument memory ops */
+#define CF_USE_ICOUNT    0x00020000
+#define CF_INVALID       0x00040000 /* TB is stale. Set with @jmp_lock held */
+#define CF_PARALLEL      0x00080000 /* Generate code for a parallel context */
+#define CF_CLUSTER_MASK  0xff000000 /* Top 8 bits are cluster ID */
 #define CF_CLUSTER_SHIFT 24
 
     /* Per-vCPU dynamic tracing state used to generate this TB */
diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index ef4214d893..d3232d5764 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -147,7 +147,13 @@ static void init_delay_params(SyncClocks *sc, const CPUState *cpu)
 
 uint32_t curr_cflags(CPUState *cpu)
 {
-    return cpu->tcg_cflags;
+    uint32_t cflags = cpu->tcg_cflags;
+
+    if (qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
+        cflags |= CF_NO_GOTO_TB | CF_NO_GOTO_PTR;
+    }
+
+    return cflags;
 }
 
 /* Might cause an exception, so have a longjmp destination ready */
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index 59804af37b..2ea5a74f30 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -33,6 +33,11 @@ void translator_loop_temp_check(DisasContextBase *db)
 
 bool translator_use_goto_tb(DisasContextBase *db, target_ulong dest)
 {
+    /* Suppress goto_tb if requested. */
+    if (tb_cflags(db->tb) & CF_NO_GOTO_TB) {
+        return false;
+    }
+
     /* Suppress goto_tb in the case of single-steping.  */
     if (db->singlestep_enabled || singlestep) {
         return false;
diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index 3d5db9a33c..1c08fc55d8 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -2723,10 +2723,6 @@ void tcg_gen_exit_tb(const TranslationBlock *tb, unsigned idx)
            seen this numbered exit before, via tcg_gen_goto_tb.  */
         tcg_debug_assert(tcg_ctx->goto_tb_issue_mask & (1 << idx));
 #endif
-        /* When not chaining, exit without indicating a link.  */
-        if (qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
-            val = 0;
-        }
     } else {
         /* This is an exit via the exitreq label.  */
         tcg_debug_assert(idx == TB_EXIT_REQUESTED);
@@ -2738,6 +2734,8 @@ void tcg_gen_exit_tb(const TranslationBlock *tb, unsigned idx)
 
 void tcg_gen_goto_tb(unsigned idx)
 {
+    /* We tested CF_NO_GOTO_TB in translator_use_goto_tb. */
+    tcg_debug_assert(!(tcg_ctx->tb_cflags & CF_NO_GOTO_TB));
     /* We only support two chained exits.  */
     tcg_debug_assert(idx <= TB_EXIT_IDXMAX);
 #ifdef CONFIG_DEBUG_TCG
@@ -2746,25 +2744,23 @@ void tcg_gen_goto_tb(unsigned idx)
     tcg_ctx->goto_tb_issue_mask |= 1 << idx;
 #endif
     plugin_gen_disable_mem_helpers();
-    /* When not chaining, we simply fall through to the "fallback" exit.  */
-    if (!qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
-        tcg_gen_op1i(INDEX_op_goto_tb, idx);
-    }
+    tcg_gen_op1i(INDEX_op_goto_tb, idx);
 }
 
 void tcg_gen_lookup_and_goto_ptr(void)
 {
-    if (!qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
-        TCGv_ptr ptr;
+    TCGv_ptr ptr;
 
-        plugin_gen_disable_mem_helpers();
-        ptr = tcg_temp_new_ptr();
-        gen_helper_lookup_tb_ptr(ptr, cpu_env);
-        tcg_gen_op1i(INDEX_op_goto_ptr, tcgv_ptr_arg(ptr));
-        tcg_temp_free_ptr(ptr);
-    } else {
+    if (tcg_ctx->tb_cflags & CF_NO_GOTO_PTR) {
         tcg_gen_exit_tb(NULL, 0);
+        return;
     }
+
+    plugin_gen_disable_mem_helpers();
+    ptr = tcg_temp_new_ptr();
+    gen_helper_lookup_tb_ptr(ptr, cpu_env);
+    tcg_gen_op1i(INDEX_op_goto_ptr, tcgv_ptr_arg(ptr));
+    tcg_temp_free_ptr(ptr);
 }
 
 static inline MemOp tcg_canonicalize_memop(MemOp op, bool is64, bool st)
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 10/17] accel/tcg: Drop CF_NO_GOTO_PTR from -d nochain
  2021-07-01 15:25 [PATCH 00/17] tcg: breakpoint reorg Richard Henderson
                   ` (8 preceding siblings ...)
  2021-07-01 15:25 ` [PATCH 09/17] accel/tcg: Add CF_NO_GOTO_TB and CF_NO_GOTO_PTR Richard Henderson
@ 2021-07-01 15:25 ` Richard Henderson
  2021-07-01 15:25 ` [PATCH 11/17] accel/tcg: Handle -singlestep in curr_cflags Richard Henderson
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 24+ messages in thread
From: Richard Henderson @ 2021-07-01 15:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, mark.cave-ayland, alex.bennee, f4bug

The purpose of suppressing goto_ptr from -d nochain had been
to return to the main loop so that -d cpu would be recognized.
But we now include -d cpu logging in helper_lookup_tb_ptr so
there is no need to exclude goto_ptr.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cpu-exec.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index d3232d5764..70ea3c7d68 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -150,7 +150,7 @@ uint32_t curr_cflags(CPUState *cpu)
     uint32_t cflags = cpu->tcg_cflags;
 
     if (qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
-        cflags |= CF_NO_GOTO_TB | CF_NO_GOTO_PTR;
+        cflags |= CF_NO_GOTO_TB;
     }
 
     return cflags;
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 11/17] accel/tcg: Handle -singlestep in curr_cflags
  2021-07-01 15:25 [PATCH 00/17] tcg: breakpoint reorg Richard Henderson
                   ` (9 preceding siblings ...)
  2021-07-01 15:25 ` [PATCH 10/17] accel/tcg: Drop CF_NO_GOTO_PTR from -d nochain Richard Henderson
@ 2021-07-01 15:25 ` Richard Henderson
  2021-07-01 15:25 ` [PATCH 12/17] accel/tcg: Use CF_NO_GOTO_{TB, PTR} in cpu_exec_step_atomic Richard Henderson
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 24+ messages in thread
From: Richard Henderson @ 2021-07-01 15:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, mark.cave-ayland, alex.bennee, f4bug

Exchange the test in translator_use_goto_tb for CF_NO_GOTO_TB,
and the test in tb_gen_code for setting CF_COUNT_MASK to 1.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cpu-exec.c      | 8 +++++++-
 accel/tcg/translate-all.c | 2 +-
 accel/tcg/translator.c    | 2 +-
 3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index 70ea3c7d68..2206c463f5 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -149,7 +149,13 @@ uint32_t curr_cflags(CPUState *cpu)
 {
     uint32_t cflags = cpu->tcg_cflags;
 
-    if (qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
+    /*
+     * For singlestep and -d nochain, suppress goto_tb so that
+     * we can log -d cpu,exec after every TB.
+     */
+    if (singlestep) {
+        cflags |= CF_NO_GOTO_TB | 1;
+    } else if (qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
         cflags |= CF_NO_GOTO_TB;
     }
 
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index d0177d772d..586f8ca4ef 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -1437,7 +1437,7 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
     }
     QEMU_BUILD_BUG_ON(CF_COUNT_MASK + 1 != TCG_MAX_INSNS);
 
-    if (cpu->singlestep_enabled || singlestep) {
+    if (cpu->singlestep_enabled) {
         max_insns = 1;
     }
 
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index 2ea5a74f30..a59eb7c11b 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -39,7 +39,7 @@ bool translator_use_goto_tb(DisasContextBase *db, target_ulong dest)
     }
 
     /* Suppress goto_tb in the case of single-steping.  */
-    if (db->singlestep_enabled || singlestep) {
+    if (db->singlestep_enabled) {
         return false;
     }
 
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 12/17] accel/tcg: Use CF_NO_GOTO_{TB, PTR} in cpu_exec_step_atomic
  2021-07-01 15:25 [PATCH 00/17] tcg: breakpoint reorg Richard Henderson
                   ` (10 preceding siblings ...)
  2021-07-01 15:25 ` [PATCH 11/17] accel/tcg: Handle -singlestep in curr_cflags Richard Henderson
@ 2021-07-01 15:25 ` Richard Henderson
  2021-07-01 15:25 ` [PATCH 13/17] accel/tcg: Move cflags lookup into tb_find Richard Henderson
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 24+ messages in thread
From: Richard Henderson @ 2021-07-01 15:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, mark.cave-ayland, alex.bennee, f4bug

Request that the one TB returns immediately, so that
we release the exclusive lock as soon as possible.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cpu-exec.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index 2206c463f5..5bb099174f 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -330,8 +330,7 @@ void cpu_exec_step_atomic(CPUState *cpu)
     CPUArchState *env = (CPUArchState *)cpu->env_ptr;
     TranslationBlock *tb;
     target_ulong cs_base, pc;
-    uint32_t flags;
-    uint32_t cflags = (curr_cflags(cpu) & ~CF_PARALLEL) | 1;
+    uint32_t flags, cflags;
     int tb_exit;
 
     if (sigsetjmp(cpu->jmp_env, 0) == 0) {
@@ -341,8 +340,14 @@ void cpu_exec_step_atomic(CPUState *cpu)
         cpu->running = true;
 
         cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
-        tb = tb_lookup(cpu, pc, cs_base, flags, cflags);
 
+        cflags = curr_cflags(cpu);
+        /* Execute in a serial context. */
+        cflags &= ~CF_PARALLEL;
+        /* After 1 insn, return and release the exclusive lock. */
+        cflags |= CF_NO_GOTO_TB | CF_NO_GOTO_PTR | 1;
+
+        tb = tb_lookup(cpu, pc, cs_base, flags, cflags);
         if (tb == NULL) {
             mmap_lock();
             tb = tb_gen_code(cpu, pc, cs_base, flags, cflags);
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 13/17] accel/tcg: Move cflags lookup into tb_find
  2021-07-01 15:25 [PATCH 00/17] tcg: breakpoint reorg Richard Henderson
                   ` (11 preceding siblings ...)
  2021-07-01 15:25 ` [PATCH 12/17] accel/tcg: Use CF_NO_GOTO_{TB, PTR} in cpu_exec_step_atomic Richard Henderson
@ 2021-07-01 15:25 ` Richard Henderson
  2021-07-01 15:25 ` [PATCH 14/17] accel/tcg: Adjust interface of TranslatorOps.breakpoint_check Richard Henderson
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 24+ messages in thread
From: Richard Henderson @ 2021-07-01 15:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, mark.cave-ayland, alex.bennee, f4bug

We will shortly require the guest pc for computing cflags,
so move the choice just after cpu_get_tb_cpu_state.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cpu-exec.c | 34 +++++++++++++++++-----------------
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index 5bb099174f..4d043a11aa 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -502,15 +502,29 @@ static inline void tb_add_jump(TranslationBlock *tb, int n,
 
 static inline TranslationBlock *tb_find(CPUState *cpu,
                                         TranslationBlock *last_tb,
-                                        int tb_exit, uint32_t cflags)
+                                        int tb_exit)
 {
     CPUArchState *env = (CPUArchState *)cpu->env_ptr;
     TranslationBlock *tb;
     target_ulong cs_base, pc;
-    uint32_t flags;
+    uint32_t flags, cflags;
 
     cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
 
+    /*
+     * When requested, use an exact setting for cflags for the next
+     * execution.  This is used for icount, precise smc, and stop-
+     * after-access watchpoints.  Since this request should never
+     * have CF_INVALID set, -1 is a convenient invalid value that
+     * does not require tcg headers for cpu_common_reset.
+     */
+    cflags = cpu->cflags_next_tb;
+    if (cflags == -1) {
+        cflags = curr_cflags(cpu);
+    } else {
+        cpu->cflags_next_tb = -1;
+    }
+
     tb = tb_lookup(cpu, pc, cs_base, flags, cflags);
     if (tb == NULL) {
         mmap_lock();
@@ -868,21 +882,7 @@ int cpu_exec(CPUState *cpu)
         int tb_exit = 0;
 
         while (!cpu_handle_interrupt(cpu, &last_tb)) {
-            uint32_t cflags = cpu->cflags_next_tb;
-            TranslationBlock *tb;
-
-            /* When requested, use an exact setting for cflags for the next
-               execution.  This is used for icount, precise smc, and stop-
-               after-access watchpoints.  Since this request should never
-               have CF_INVALID set, -1 is a convenient invalid value that
-               does not require tcg headers for cpu_common_reset.  */
-            if (cflags == -1) {
-                cflags = curr_cflags(cpu);
-            } else {
-                cpu->cflags_next_tb = -1;
-            }
-
-            tb = tb_find(cpu, last_tb, tb_exit, cflags);
+            TranslationBlock *tb = tb_find(cpu, last_tb, tb_exit);
             cpu_loop_exec_tb(cpu, tb, &last_tb, &tb_exit);
             /* Try to align the host and virtual clocks
                if the guest is in advance */
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 14/17] accel/tcg: Adjust interface of TranslatorOps.breakpoint_check
  2021-07-01 15:25 [PATCH 00/17] tcg: breakpoint reorg Richard Henderson
                   ` (12 preceding siblings ...)
  2021-07-01 15:25 ` [PATCH 13/17] accel/tcg: Move cflags lookup into tb_find Richard Henderson
@ 2021-07-01 15:25 ` Richard Henderson
  2021-07-01 15:25 ` [PATCH 15/17] accel/tcg: Hoist tb_cflags to a local in translator_loop Richard Henderson
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 24+ messages in thread
From: Richard Henderson @ 2021-07-01 15:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, mark.cave-ayland, alex.bennee, f4bug

We don't need the whole CPUBreakpoint structure in the check,
only the flags.  Return the instruction length to consolidate
the adjustment of db->pc_next.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/exec/translator.h     | 17 +++++++++------
 accel/tcg/translator.c        | 40 ++++++++++++++++++++++++-----------
 target/alpha/translate.c      | 12 +++--------
 target/arm/translate-a64.c    | 14 ++++--------
 target/arm/translate.c        | 20 +++++++-----------
 target/avr/translate.c        |  6 +++---
 target/cris/translate.c       | 14 ++++--------
 target/hexagon/translate.c    | 13 +++---------
 target/hppa/translate.c       |  7 +++---
 target/i386/tcg/translate.c   | 15 ++++---------
 target/m68k/translate.c       | 14 +++---------
 target/microblaze/translate.c | 14 +++---------
 target/mips/tcg/translate.c   | 14 ++++--------
 target/nios2/translate.c      | 13 +++---------
 target/openrisc/translate.c   | 11 +++-------
 target/ppc/translate.c        | 13 +++---------
 target/riscv/translate.c      | 11 +++-------
 target/rx/translate.c         |  8 +++----
 target/s390x/translate.c      | 12 ++++-------
 target/sh4/translate.c        | 12 ++++-------
 target/sparc/translate.c      |  9 ++++----
 target/tricore/translate.c    | 13 +++---------
 target/xtensa/translate.c     | 12 ++++-------
 23 files changed, 115 insertions(+), 199 deletions(-)

diff --git a/include/exec/translator.h b/include/exec/translator.h
index dd9c06d40d..433b753c5c 100644
--- a/include/exec/translator.h
+++ b/include/exec/translator.h
@@ -92,11 +92,15 @@ typedef struct DisasContextBase {
  * @breakpoint_check:
  *      When called, the breakpoint has already been checked to match the PC,
  *      but the target may decide the breakpoint missed the address
- *      (e.g., due to conditions encoded in their flags).  Return true to
- *      indicate that the breakpoint did hit, in which case no more breakpoints
- *      are checked.  If the breakpoint did hit, emit any code required to
- *      signal the exception, and set db->is_jmp as necessary to terminate
- *      the main loop.
+ *      (e.g., due to conditions encoded in their flags), in which case
+ *      db->is_jmp may be left as DISAS_NEXT or DISAS_TOO_MANY to indicate
+ *      that the insn should be translated.  Anything other than those two
+ *      will be taken to indicate an exception has been raised, but in most
+ *      cases db->is_jmp should be set to DISAS_NORETURN.
+ *
+ *      Return the minimum instruction size that should be applied to the TB.
+ *      The size of any TB cannot be zero, as that breaks the math used to
+ *      invalidate TBs.
  *
  * @translate_insn:
  *      Disassemble one instruction and set db->pc_next for the start
@@ -113,8 +117,7 @@ typedef struct TranslatorOps {
     void (*init_disas_context)(DisasContextBase *db, CPUState *cpu);
     void (*tb_start)(DisasContextBase *db, CPUState *cpu);
     void (*insn_start)(DisasContextBase *db, CPUState *cpu);
-    bool (*breakpoint_check)(DisasContextBase *db, CPUState *cpu,
-                             const CPUBreakpoint *bp);
+    int (*breakpoint_check)(DisasContextBase *db, CPUState *cpu, int flags);
     void (*translate_insn)(DisasContextBase *db, CPUState *cpu);
     void (*tb_stop)(DisasContextBase *db, CPUState *cpu);
     void (*disas_log)(const DisasContextBase *db, CPUState *cpu);
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index a59eb7c11b..1c44d096d8 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -50,7 +50,6 @@ bool translator_use_goto_tb(DisasContextBase *db, target_ulong dest)
 void translator_loop(const TranslatorOps *ops, DisasContextBase *db,
                      CPUState *cpu, TranslationBlock *tb, int max_insns)
 {
-    int bp_insn = 0;
     bool plugin_enabled;
 
     /* Initialize DisasContext */
@@ -91,19 +90,35 @@ void translator_loop(const TranslatorOps *ops, DisasContextBase *db,
             CPUBreakpoint *bp;
             QTAILQ_FOREACH(bp, &cpu->breakpoints, entry) {
                 if (bp->pc == db->pc_next) {
-                    if (ops->breakpoint_check(db, cpu, bp)) {
-                        bp_insn = 1;
-                        break;
+                    int len = ops->breakpoint_check(db, cpu, bp->flags);
+
+                    /*
+                     * The breakpoint_check hook may use DISAS_TOO_MANY
+                     * to indicate that only one more instruction is to
+                     * be executed.  Otherwise it should use DISAS_NORETURN
+                     * when generating an exception, but may use a
+                     * DISAS_TARGET_* value for Something Else.
+                     */
+                    if (db->is_jmp > DISAS_TOO_MANY) {
+                        /*
+                         * The address covered by the breakpoint must be
+                         * included in [tb->pc, tb->pc + tb->size) in order
+                         * to for it to be properly cleared.  Thus we
+                         * increment the PC here so that the logic setting
+                         * tb->size below does the right thing.
+                         */
+                        tcg_debug_assert(len > 0);
+                        db->pc_next += len;
+
+                        /*
+                         * The breakpoint definitely hit, so decrement the
+                         * number of instructions completed for icount.
+                         */
+                        db->num_insns--;
+                        goto done;
                     }
                 }
             }
-            /* The breakpoint_check hook may use DISAS_TOO_MANY to indicate
-               that only one more instruction is to be executed.  Otherwise
-               it should use DISAS_NORETURN when generating an exception,
-               but may use a DISAS_TARGET_* value for Something Else.  */
-            if (db->is_jmp > DISAS_TOO_MANY) {
-                break;
-            }
         }
 
         /* Disassemble one instruction.  The translate_insn hook should
@@ -142,9 +157,10 @@ void translator_loop(const TranslatorOps *ops, DisasContextBase *db,
         }
     }
 
+ done:
     /* Emit code to exit the TB, as indicated by db->is_jmp.  */
     ops->tb_stop(db, cpu);
-    gen_tb_end(db->tb, db->num_insns - bp_insn);
+    gen_tb_end(db->tb, db->num_insns);
 
     if (plugin_enabled) {
         plugin_gen_tb_end(cpu);
diff --git a/target/alpha/translate.c b/target/alpha/translate.c
index de769f7633..6521bea7af 100644
--- a/target/alpha/translate.c
+++ b/target/alpha/translate.c
@@ -3013,19 +3013,13 @@ static void alpha_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
     tcg_gen_insn_start(dcbase->pc_next);
 }
 
-static bool alpha_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
-                                      const CPUBreakpoint *bp)
+static int alpha_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
+                                     int bp_flags)
 {
     DisasContext *ctx = container_of(dcbase, DisasContext, base);
 
     ctx->base.is_jmp = gen_excp(ctx, EXCP_DEBUG, 0);
-
-    /* The address covered by the breakpoint must be included in
-       [tb->pc, tb->pc + tb->size) in order to for it to be
-       properly cleared -- thus we increment the PC here so that
-       the logic setting tb->size below does the right thing.  */
-    ctx->base.pc_next += 4;
-    return true;
+    return 4; /* minimum instruction length */
 }
 
 static void alpha_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index eb1907d049..74bbee1360 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -14917,28 +14917,22 @@ static void aarch64_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
     dc->insn_start = tcg_last_op();
 }
 
-static bool aarch64_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
-                                        const CPUBreakpoint *bp)
+static int aarch64_tr_breakpoint_check(DisasContextBase *dcbase,
+                                       CPUState *cpu, int bp_flags)
 {
     DisasContext *dc = container_of(dcbase, DisasContext, base);
 
-    if (bp->flags & BP_CPU) {
+    if (bp_flags & BP_CPU) {
         gen_a64_set_pc_im(dc->base.pc_next);
         gen_helper_check_breakpoints(cpu_env);
         /* End the TB early; it likely won't be executed */
         dc->base.is_jmp = DISAS_TOO_MANY;
     } else {
         gen_exception_internal_insn(dc, dc->base.pc_next, EXCP_DEBUG);
-        /* The address covered by the breakpoint must be
-           included in [tb->pc, tb->pc + tb->size) in order
-           to for it to be properly cleared -- thus we
-           increment the PC here so that the logic setting
-           tb->size below does the right thing.  */
-        dc->base.pc_next += 4;
         dc->base.is_jmp = DISAS_NORETURN;
     }
 
-    return true;
+    return 4; /* minimum instruction length */
 }
 
 static void aarch64_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 87c3c09df5..452b5a8168 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -9116,12 +9116,12 @@ static void arm_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
     dc->insn_start = tcg_last_op();
 }
 
-static bool arm_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
-                                    const CPUBreakpoint *bp)
+static int arm_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
+                                   int bp_flags)
 {
     DisasContext *dc = container_of(dcbase, DisasContext, base);
 
-    if (bp->flags & BP_CPU) {
+    if (bp_flags & BP_CPU) {
         gen_set_condexec(dc);
         gen_set_pc_im(dc, dc->base.pc_next);
         gen_helper_check_breakpoints(cpu_env);
@@ -9129,18 +9129,14 @@ static bool arm_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
         dc->base.is_jmp = DISAS_TOO_MANY;
     } else {
         gen_exception_internal_insn(dc, dc->base.pc_next, EXCP_DEBUG);
-        /* The address covered by the breakpoint must be
-           included in [tb->pc, tb->pc + tb->size) in order
-           to for it to be properly cleared -- thus we
-           increment the PC here so that the logic setting
-           tb->size below does the right thing.  */
-        /* TODO: Advance PC by correct instruction length to
-         * avoid disassembler error messages */
-        dc->base.pc_next += 2;
         dc->base.is_jmp = DISAS_NORETURN;
     }
 
-    return true;
+    /*
+     * TODO: Advance PC by correct instruction length to avoid disassembler
+     * error messages.  In the meantime, minimum instruction length.
+     */
+    return 2;
 }
 
 static bool arm_pre_translate_insn(DisasContext *dc)
diff --git a/target/avr/translate.c b/target/avr/translate.c
index 8237a03c23..73ff467926 100644
--- a/target/avr/translate.c
+++ b/target/avr/translate.c
@@ -2944,13 +2944,13 @@ static void avr_tr_insn_start(DisasContextBase *dcbase, CPUState *cs)
     tcg_gen_insn_start(ctx->npc);
 }
 
-static bool avr_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
-                                    const CPUBreakpoint *bp)
+static int avr_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
+                                   int bp_flags)
 {
     DisasContext *ctx = container_of(dcbase, DisasContext, base);
 
     gen_breakpoint(ctx);
-    return true;
+    return 2; /* minimum instruction length */
 }
 
 static void avr_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
diff --git a/target/cris/translate.c b/target/cris/translate.c
index e33a3bb326..97710ef4a6 100644
--- a/target/cris/translate.c
+++ b/target/cris/translate.c
@@ -3119,8 +3119,8 @@ static void cris_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
     tcg_gen_insn_start(dc->delayed_branch == 1 ? dc->ppc | 1 : dc->pc);
 }
 
-static bool cris_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
-                                     const CPUBreakpoint *bp)
+static int cris_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
+                                    int bp_flags)
 {
     DisasContext *dc = container_of(dcbase, DisasContext, base);
 
@@ -3128,14 +3128,8 @@ static bool cris_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
     tcg_gen_movi_tl(env_pc, dc->pc);
     t_gen_raise_exception(EXCP_DEBUG);
     dc->base.is_jmp = DISAS_NORETURN;
-    /*
-     * The address covered by the breakpoint must be included in
-     * [tb->pc, tb->pc + tb->size) in order to for it to be
-     * properly cleared -- thus we increment the PC here so that
-     * the logic setting tb->size below does the right thing.
-     */
-    dc->pc += 2;
-    return true;
+
+    return 2; /* minimum instruction length */
 }
 
 static void cris_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c
index 9a37644182..ce3569fa7c 100644
--- a/target/hexagon/translate.c
+++ b/target/hexagon/translate.c
@@ -547,20 +547,13 @@ static void hexagon_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
     tcg_gen_insn_start(ctx->base.pc_next);
 }
 
-static bool hexagon_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
-                                        const CPUBreakpoint *bp)
+static int hexagon_tr_breakpoint_check(DisasContextBase *dcbase,
+                                       CPUState *cpu, int bp_flags)
 {
     DisasContext *ctx = container_of(dcbase, DisasContext, base);
 
     gen_exception_end_tb(ctx, EXCP_DEBUG);
-    /*
-     * The address covered by the breakpoint must be included in
-     * [tb->pc, tb->pc + tb->size) in order to for it to be
-     * properly cleared -- thus we increment the PC here so that
-     * the logic setting tb->size below does the right thing.
-     */
-    ctx->base.pc_next += 4;
-    return true;
+    return 4; /* minimum packet length */
 }
 
 static bool pkt_crosses_page(CPUHexagonState *env, DisasContext *ctx)
diff --git a/target/hppa/translate.c b/target/hppa/translate.c
index 952cfe09a6..1c7bb2a413 100644
--- a/target/hppa/translate.c
+++ b/target/hppa/translate.c
@@ -4205,14 +4205,13 @@ static void hppa_tr_insn_start(DisasContextBase *dcbase, CPUState *cs)
     tcg_gen_insn_start(ctx->iaoq_f, ctx->iaoq_b);
 }
 
-static bool hppa_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
-                                      const CPUBreakpoint *bp)
+static int hppa_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
+                                    int bp_flags)
 {
     DisasContext *ctx = container_of(dcbase, DisasContext, base);
 
     gen_excp(ctx, EXCP_DEBUG);
-    ctx->base.pc_next += 4;
-    return true;
+    return 4; /* minimum instruction length */
 }
 
 static void hppa_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index eb9ee0296f..08fc90518b 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -8583,23 +8583,16 @@ static void i386_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
     tcg_gen_insn_start(dc->base.pc_next, dc->cc_op);
 }
 
-static bool i386_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
-                                     const CPUBreakpoint *bp)
+static int i386_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
+                                    int bp_flags)
 {
     DisasContext *dc = container_of(dcbase, DisasContext, base);
     /* If RF is set, suppress an internally generated breakpoint.  */
     int flags = dc->base.tb->flags & HF_RF_MASK ? BP_GDB : BP_ANY;
-    if (bp->flags & flags) {
+    if (bp_flags & flags) {
         gen_debug(dc);
-        /* The address covered by the breakpoint must be included in
-           [tb->pc, tb->pc + tb->size) in order to for it to be
-           properly cleared -- thus we increment the PC here so that
-           the generic logic setting tb->size later does the right thing.  */
-        dc->base.pc_next += 1;
-        return true;
-    } else {
-        return false;
     }
+    return 1; /* minimum instruction length */
 }
 
 static void i386_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
diff --git a/target/m68k/translate.c b/target/m68k/translate.c
index 05b96fdda7..2c921d4260 100644
--- a/target/m68k/translate.c
+++ b/target/m68k/translate.c
@@ -6209,21 +6209,13 @@ static void m68k_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
     tcg_gen_insn_start(dc->base.pc_next, dc->cc_op);
 }
 
-static bool m68k_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
-                                     const CPUBreakpoint *bp)
+static int m68k_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
+                                    int bp_flags)
 {
     DisasContext *dc = container_of(dcbase, DisasContext, base);
 
     gen_exception(dc, dc->base.pc_next, EXCP_DEBUG);
-    /*
-     * The address covered by the breakpoint must be included in
-     * [tb->pc, tb->pc + tb->size) in order to for it to be
-     * properly cleared -- thus we increment the PC here so that
-     * the logic setting tb->size below does the right thing.
-     */
-    dc->base.pc_next += 2;
-
-    return true;
+    return 2; /* minimum instruction length */
 }
 
 static void m68k_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
diff --git a/target/microblaze/translate.c b/target/microblaze/translate.c
index b753f080e7..b0c355a3eb 100644
--- a/target/microblaze/translate.c
+++ b/target/microblaze/translate.c
@@ -1674,21 +1674,13 @@ static void mb_tr_insn_start(DisasContextBase *dcb, CPUState *cs)
     dc->insn_start = tcg_last_op();
 }
 
-static bool mb_tr_breakpoint_check(DisasContextBase *dcb, CPUState *cs,
-                                   const CPUBreakpoint *bp)
+static int mb_tr_breakpoint_check(DisasContextBase *dcb, CPUState *cs,
+                                  int bp_flags)
 {
     DisasContext *dc = container_of(dcb, DisasContext, base);
 
     gen_raise_exception_sync(dc, EXCP_DEBUG);
-
-    /*
-     * The address covered by the breakpoint must be included in
-     * [tb->pc, tb->pc + tb->size) in order to for it to be
-     * properly cleared -- thus we increment the PC here so that
-     * the logic setting tb->size below does the right thing.
-     */
-    dc->base.pc_next += 4;
-    return true;
+    return 4; /* minimum instruction length */
 }
 
 static void mb_tr_translate_insn(DisasContextBase *dcb, CPUState *cs)
diff --git a/target/mips/tcg/translate.c b/target/mips/tcg/translate.c
index 17e79f3de3..441c36fad9 100644
--- a/target/mips/tcg/translate.c
+++ b/target/mips/tcg/translate.c
@@ -25442,22 +25442,16 @@ static void mips_tr_insn_start(DisasContextBase *dcbase, CPUState *cs)
                        ctx->btarget);
 }
 
-static bool mips_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
-                                     const CPUBreakpoint *bp)
+static int mips_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
+                                    int bp_flags)
 {
     DisasContext *ctx = container_of(dcbase, DisasContext, base);
 
     save_cpu_state(ctx, 1);
     ctx->base.is_jmp = DISAS_NORETURN;
     gen_helper_raise_exception_debug(cpu_env);
-    /*
-     * The address covered by the breakpoint must be included in
-     * [tb->pc, tb->pc + tb->size) in order to for it to be
-     * properly cleared -- thus we increment the PC here so that
-     * the logic setting tb->size below does the right thing.
-     */
-    ctx->base.pc_next += 4;
-    return true;
+
+    return 2; /* minimum instruction length */
 }
 
 static void mips_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
diff --git a/target/nios2/translate.c b/target/nios2/translate.c
index 17742cebc7..1d1c66b88f 100644
--- a/target/nios2/translate.c
+++ b/target/nios2/translate.c
@@ -777,20 +777,13 @@ static void nios2_tr_insn_start(DisasContextBase *dcbase, CPUState *cs)
     tcg_gen_insn_start(dcbase->pc_next);
 }
 
-static bool nios2_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
-                                      const CPUBreakpoint *bp)
+static int nios2_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
+                                     int bp_flags)
 {
     DisasContext *dc = container_of(dcbase, DisasContext, base);
 
     gen_exception(dc, EXCP_DEBUG);
-    /*
-     * The address covered by the breakpoint must be included in
-     * [tb->pc, tb->pc + tb->size) in order to for it to be
-     * properly cleared -- thus we increment the PC here so that
-     * the logic setting tb->size below does the right thing.
-     */
-    dc->base.pc_next += 4;
-    return true;
+    return 4; /* minimum instruction length */
 }
 
 static void nios2_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
diff --git a/target/openrisc/translate.c b/target/openrisc/translate.c
index 2d142d8577..1da064edb4 100644
--- a/target/openrisc/translate.c
+++ b/target/openrisc/translate.c
@@ -1640,20 +1640,15 @@ static void openrisc_tr_insn_start(DisasContextBase *dcbase, CPUState *cs)
                        | (dc->base.num_insns > 1 ? 2 : 0));
 }
 
-static bool openrisc_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
-                                         const CPUBreakpoint *bp)
+static int openrisc_tr_breakpoint_check(DisasContextBase *dcbase,
+                                        CPUState *cs, int bp_flags)
 {
     DisasContext *dc = container_of(dcbase, DisasContext, base);
 
     tcg_gen_movi_tl(cpu_pc, dc->base.pc_next);
     gen_exception(dc, EXCP_DEBUG);
     dc->base.is_jmp = DISAS_NORETURN;
-    /* The address covered by the breakpoint must be included in
-       [tb->pc, tb->pc + tb->size) in order to for it to be
-       properly cleared -- thus we increment the PC here so that
-       the logic setting tb->size below does the right thing.  */
-    dc->base.pc_next += 4;
-    return true;
+    return 4; /* minimum instruction length */
 }
 
 static void openrisc_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index 0fb09f2301..ffefb5e78b 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -8561,21 +8561,14 @@ static void ppc_tr_insn_start(DisasContextBase *dcbase, CPUState *cs)
     tcg_gen_insn_start(dcbase->pc_next);
 }
 
-static bool ppc_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
-                                    const CPUBreakpoint *bp)
+static int ppc_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
+                                   int bp_flags)
 {
     DisasContext *ctx = container_of(dcbase, DisasContext, base);
 
     gen_update_nip(ctx, ctx->base.pc_next);
     gen_debug_exception(ctx);
-    /*
-     * The address covered by the breakpoint must be included in
-     * [tb->pc, tb->pc + tb->size) in order to for it to be properly
-     * cleared -- thus we increment the PC here so that the logic
-     * setting tb->size below does the right thing.
-     */
-    ctx->base.pc_next += 4;
-    return true;
+    return 4; /* minimum instruction length */
 }
 
 static bool is_prefix_insn(DisasContext *ctx, uint32_t insn)
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index deda0c8a44..8a6bc58572 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -961,20 +961,15 @@ static void riscv_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
     tcg_gen_insn_start(ctx->base.pc_next);
 }
 
-static bool riscv_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
-                                      const CPUBreakpoint *bp)
+static int riscv_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
+                                     int bp_flags)
 {
     DisasContext *ctx = container_of(dcbase, DisasContext, base);
 
     tcg_gen_movi_tl(cpu_pc, ctx->base.pc_next);
     ctx->base.is_jmp = DISAS_NORETURN;
     gen_exception_debug();
-    /* The address covered by the breakpoint must be included in
-       [tb->pc, tb->pc + tb->size) in order to for it to be
-       properly cleared -- thus we increment the PC here so that
-       the logic setting tb->size below does the right thing.  */
-    ctx->base.pc_next += 4;
-    return true;
+    return 2; /* minimum instruction length */
 }
 
 static void riscv_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
diff --git a/target/rx/translate.c b/target/rx/translate.c
index 2443406de5..b0cd517e0b 100644
--- a/target/rx/translate.c
+++ b/target/rx/translate.c
@@ -2310,8 +2310,8 @@ static void rx_tr_insn_start(DisasContextBase *dcbase, CPUState *cs)
     tcg_gen_insn_start(ctx->base.pc_next);
 }
 
-static bool rx_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
-                                    const CPUBreakpoint *bp)
+static int rx_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
+                                  int bp_flags)
 {
     DisasContext *ctx = container_of(dcbase, DisasContext, base);
 
@@ -2319,8 +2319,8 @@ static bool rx_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
     tcg_gen_movi_i32(cpu_pc, ctx->base.pc_next);
     gen_helper_debug(cpu_env);
     ctx->base.is_jmp = DISAS_NORETURN;
-    ctx->base.pc_next += 1;
-    return true;
+
+    return 1; /* minimum instruction length */
 }
 
 static void rx_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
diff --git a/target/s390x/translate.c b/target/s390x/translate.c
index 4742f59ca9..4634bb79e6 100644
--- a/target/s390x/translate.c
+++ b/target/s390x/translate.c
@@ -6566,8 +6566,8 @@ static void s390x_tr_insn_start(DisasContextBase *dcbase, CPUState *cs)
 {
 }
 
-static bool s390x_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
-                                      const CPUBreakpoint *bp)
+static int s390x_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
+                                     int bp_flags)
 {
     DisasContext *dc = container_of(dcbase, DisasContext, base);
 
@@ -6581,12 +6581,8 @@ static bool s390x_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
 
     dc->base.is_jmp = DISAS_PC_STALE;
     dc->do_debug = true;
-    /* The address covered by the breakpoint must be included in
-       [tb->pc, tb->pc + tb->size) in order to for it to be
-       properly cleared -- thus we increment the PC here so that
-       the logic setting tb->size does the right thing.  */
-    dc->base.pc_next += 2;
-    return true;
+
+    return 2; /* minimum instruction length */
 }
 
 static void s390x_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
diff --git a/target/sh4/translate.c b/target/sh4/translate.c
index db09a0bce3..3672802785 100644
--- a/target/sh4/translate.c
+++ b/target/sh4/translate.c
@@ -2290,8 +2290,8 @@ static void sh4_tr_insn_start(DisasContextBase *dcbase, CPUState *cs)
     tcg_gen_insn_start(ctx->base.pc_next, ctx->envflags);
 }
 
-static bool sh4_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
-                                    const CPUBreakpoint *bp)
+static int sh4_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
+                                   int bp_flags)
 {
     DisasContext *ctx = container_of(dcbase, DisasContext, base);
 
@@ -2299,12 +2299,8 @@ static bool sh4_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
     gen_save_cpu_state(ctx, true);
     gen_helper_debug(cpu_env);
     ctx->base.is_jmp = DISAS_NORETURN;
-    /* The address covered by the breakpoint must be included in
-       [tb->pc, tb->pc + tb->size) in order to for it to be
-       properly cleared -- thus we increment the PC here so that
-       the logic setting tb->size below does the right thing.  */
-    ctx->base.pc_next += 2;
-    return true;
+
+    return 2; /* minimum instruction length */
 }
 
 static void sh4_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index fb0c242606..fbc77b693d 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -5855,8 +5855,8 @@ static void sparc_tr_insn_start(DisasContextBase *dcbase, CPUState *cs)
     }
 }
 
-static bool sparc_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
-                                      const CPUBreakpoint *bp)
+static int sparc_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
+                                     int bp_flags)
 {
     DisasContext *dc = container_of(dcbase, DisasContext, base);
 
@@ -5866,9 +5866,8 @@ static bool sparc_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
     gen_helper_debug(cpu_env);
     tcg_gen_exit_tb(NULL, 0);
     dc->base.is_jmp = DISAS_NORETURN;
-    /* update pc_next so that the current instruction is included in tb->size */
-    dc->base.pc_next += 4;
-    return true;
+
+    return 4; /* minimum instruction length */
 }
 
 static void sparc_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
diff --git a/target/tricore/translate.c b/target/tricore/translate.c
index 865020754d..8c39134d52 100644
--- a/target/tricore/translate.c
+++ b/target/tricore/translate.c
@@ -8810,19 +8810,12 @@ static void tricore_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
     tcg_gen_insn_start(ctx->base.pc_next);
 }
 
-static bool tricore_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
-                                      const CPUBreakpoint *bp)
+static int tricore_tr_breakpoint_check(DisasContextBase *dcbase,
+                                       CPUState *cpu, int bp_flags)
 {
     DisasContext *ctx = container_of(dcbase, DisasContext, base);
     generate_qemu_excp(ctx, EXCP_DEBUG);
-    /*
-     * The address covered by the breakpoint must be included in
-     * [tb->pc, tb->pc + tb->size) in order to for it to be
-     * properly cleared -- thus we increment the PC here so that
-     * the logic setting tb->size below does the right thing.
-     */
-    ctx->base.pc_next += 4;
-    return true;
+    return 4; /* minimum instruction length */
 }
 
 static bool insn_crosses_page(CPUTriCoreState *env, DisasContext *ctx)
diff --git a/target/xtensa/translate.c b/target/xtensa/translate.c
index ac42f5efdc..347187d588 100644
--- a/target/xtensa/translate.c
+++ b/target/xtensa/translate.c
@@ -1233,20 +1233,16 @@ static void xtensa_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
     tcg_gen_insn_start(dcbase->pc_next);
 }
 
-static bool xtensa_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
-                                       const CPUBreakpoint *bp)
+static int xtensa_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
+                                      int bp_flags)
 {
     DisasContext *dc = container_of(dcbase, DisasContext, base);
 
     tcg_gen_movi_i32(cpu_pc, dc->base.pc_next);
     gen_exception(dc, EXCP_DEBUG);
     dc->base.is_jmp = DISAS_NORETURN;
-    /* The address covered by the breakpoint must be included in
-       [tb->pc, tb->pc + tb->size) in order to for it to be
-       properly cleared -- thus we increment the PC here so that
-       the logic setting tb->size below does the right thing.  */
-    dc->base.pc_next += 2;
-    return true;
+
+    return 2; /* minimum instruction length */
 }
 
 static void xtensa_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 15/17] accel/tcg: Hoist tb_cflags to a local in translator_loop
  2021-07-01 15:25 [PATCH 00/17] tcg: breakpoint reorg Richard Henderson
                   ` (13 preceding siblings ...)
  2021-07-01 15:25 ` [PATCH 14/17] accel/tcg: Adjust interface of TranslatorOps.breakpoint_check Richard Henderson
@ 2021-07-01 15:25 ` Richard Henderson
  2021-07-01 15:25 ` [PATCH 16/17] accel/tcg: Encode breakpoint info into tb->cflags Richard Henderson
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 24+ messages in thread
From: Richard Henderson @ 2021-07-01 15:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, mark.cave-ayland, alex.bennee, f4bug

The access internal to tb_cflags() is atomic.
Avoid re-reading it as such for the multiple uses.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/translator.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index 1c44d096d8..449159a27c 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -50,6 +50,7 @@ bool translator_use_goto_tb(DisasContextBase *db, target_ulong dest)
 void translator_loop(const TranslatorOps *ops, DisasContextBase *db,
                      CPUState *cpu, TranslationBlock *tb, int max_insns)
 {
+    uint32_t cflags = tb_cflags(tb);
     bool plugin_enabled;
 
     /* Initialize DisasContext */
@@ -72,8 +73,7 @@ void translator_loop(const TranslatorOps *ops, DisasContextBase *db,
     ops->tb_start(db, cpu);
     tcg_debug_assert(db->is_jmp == DISAS_NEXT);  /* no early exit */
 
-    plugin_enabled = plugin_gen_tb_start(cpu, tb,
-                                         tb_cflags(db->tb) & CF_MEMI_ONLY);
+    plugin_enabled = plugin_gen_tb_start(cpu, tb, cflags & CF_MEMI_ONLY);
 
     while (true) {
         db->num_insns++;
@@ -125,14 +125,13 @@ void translator_loop(const TranslatorOps *ops, DisasContextBase *db,
            update db->pc_next and db->is_jmp to indicate what should be
            done next -- either exiting this loop or locate the start of
            the next instruction.  */
-        if (db->num_insns == db->max_insns
-            && (tb_cflags(db->tb) & CF_LAST_IO)) {
+        if (db->num_insns == db->max_insns && (cflags & CF_LAST_IO)) {
             /* Accept I/O on the last instruction.  */
             gen_io_start();
             ops->translate_insn(db, cpu);
         } else {
             /* we should only see CF_MEMI_ONLY for io_recompile */
-            tcg_debug_assert(!(tb_cflags(db->tb) & CF_MEMI_ONLY));
+            tcg_debug_assert(!(cflags & CF_MEMI_ONLY));
             ops->translate_insn(db, cpu);
         }
 
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 16/17] accel/tcg: Encode breakpoint info into tb->cflags
  2021-07-01 15:25 [PATCH 00/17] tcg: breakpoint reorg Richard Henderson
                   ` (14 preceding siblings ...)
  2021-07-01 15:25 ` [PATCH 15/17] accel/tcg: Hoist tb_cflags to a local in translator_loop Richard Henderson
@ 2021-07-01 15:25 ` Richard Henderson
  2021-07-01 15:25 ` [PATCH 17/17] cpu: Add breakpoint tracepoints Richard Henderson
  2021-07-03  7:00 ` [PATCH 00/17] tcg: breakpoint reorg Mark Cave-Ayland
  17 siblings, 0 replies; 24+ messages in thread
From: Richard Henderson @ 2021-07-01 15:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, mark.cave-ayland, alex.bennee, f4bug

Having this data in cflags means that hashing takes care
of selecting a TB with or without exceptions built in.
Which means that we no longer need to flush all TBs.

This does require that we single-step while we're within a page
that contains a breakpoint, so it's not yet ideal, but should be
an improvement over some corner-case slowdowns.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/404
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/exec/exec-all.h   |  7 ++++
 accel/tcg/cpu-exec.c      | 68 ++++++++++++++++++++++++++++++-
 accel/tcg/translate-all.c |  4 --
 accel/tcg/translator.c    | 85 +++++++++++++++++++++------------------
 cpu.c                     | 24 -----------
 5 files changed, 119 insertions(+), 69 deletions(-)

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 6873cce8df..7ab2578f71 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -502,9 +502,16 @@ struct TranslationBlock {
 #define CF_USE_ICOUNT    0x00020000
 #define CF_INVALID       0x00040000 /* TB is stale. Set with @jmp_lock held */
 #define CF_PARALLEL      0x00080000 /* Generate code for a parallel context */
+#define CF_BP_MASK       0x00300000 /* See below */
+#define CF_BP_SHIFT      20
 #define CF_CLUSTER_MASK  0xff000000 /* Top 8 bits are cluster ID */
 #define CF_CLUSTER_SHIFT 24
 
+#define CF_BP_NONE       (0 << CF_BP_SHIFT) /* TB does not interact with BPs */
+#define CF_BP_SSTEP      (1 << CF_BP_SHIFT) /* gdbstub single-step in effect */
+#define CF_BP_GDB        (2 << CF_BP_SHIFT) /* gdbstub breakpoint at tb->pc */
+#define CF_BP_CPU        (3 << CF_BP_SHIFT) /* arch breakpoint at tb->pc */
+
     /* Per-vCPU dynamic tracing state used to generate this TB */
     uint32_t trace_vcpu_dstate;
 
diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index 4d043a11aa..179a425ece 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -222,6 +222,65 @@ static inline void log_cpu_exec(target_ulong pc, CPUState *cpu,
     }
 }
 
+static uint32_t cflags_for_breakpoints(CPUState *cpu, target_ulong pc,
+                                       uint32_t cflags)
+{
+    uint32_t bflags = 0;
+
+    if (unlikely(cpu->singlestep_enabled)) {
+        bflags = CF_BP_SSTEP;
+    } else {
+        bool match_page = false;
+        CPUBreakpoint *bp;
+
+        QTAILQ_FOREACH(bp, &cpu->breakpoints, entry) {
+            /* Note exact pc matches */
+            if (pc == bp->pc) {
+                if (bp->flags & BP_GDB) {
+                    bflags = CF_BP_GDB;
+                    break;
+                }
+                if (bp->flags & BP_CPU) {
+                    bflags = CF_BP_CPU;
+                    break;
+                }
+            }
+            /* Note page matches */
+            if (((pc ^ bp->pc) & TARGET_PAGE_MASK) == 0) {
+                match_page = true;
+            }
+        }
+
+        if (likely(!bflags)) {
+            if (likely(!match_page)) {
+                return cflags;
+            }
+
+            /*
+             * Within the same page as a breakpoint, single-step,
+             * returning to helper_lookup_tb_ptr after each looking
+             * for the actual breakpoint.
+             *
+             * TODO: Perhaps better to record all of the TBs associated
+             * with a given virtual page that contains a breakpoint, and
+             * then invalidate them when a new overlapping breakpoint is
+             * set on the page.  Non-overlapping TBs would not be
+             * invalidated, nor would any TB need to be invalidated as
+             * breakpoints are removed.
+             */
+            bflags = CF_NO_GOTO_TB;
+        }
+    }
+
+    /*
+     * Reduce the TB to one insn.
+     * In the case of a BP hit, we will be raising an exception anyway.
+     * In the case of a page hit, return to helper_lookup_tb_ptr after
+     * every insn to look for the breakpoint.
+     */
+    return (cflags & ~CF_COUNT_MASK) | 1 | bflags;
+}
+
 /**
  * helper_lookup_tb_ptr: quick check for next tb
  * @env: current cpu state
@@ -235,11 +294,13 @@ const void *HELPER(lookup_tb_ptr)(CPUArchState *env)
     CPUState *cpu = env_cpu(env);
     TranslationBlock *tb;
     target_ulong cs_base, pc;
-    uint32_t flags;
+    uint32_t flags, cflags;
 
     cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
 
-    tb = tb_lookup(cpu, pc, cs_base, flags, curr_cflags(cpu));
+    cflags = cflags_for_breakpoints(cpu, pc, curr_cflags(cpu));
+
+    tb = tb_lookup(cpu, pc, cs_base, flags, cflags);
     if (tb == NULL) {
         return tcg_code_gen_epilogue;
     }
@@ -346,6 +407,8 @@ void cpu_exec_step_atomic(CPUState *cpu)
         cflags &= ~CF_PARALLEL;
         /* After 1 insn, return and release the exclusive lock. */
         cflags |= CF_NO_GOTO_TB | CF_NO_GOTO_PTR | 1;
+        /* Raise any post-instruction debug exceptions. */
+        cflags = cflags_for_breakpoints(cpu, pc, cflags);
 
         tb = tb_lookup(cpu, pc, cs_base, flags, cflags);
         if (tb == NULL) {
@@ -524,6 +587,7 @@ static inline TranslationBlock *tb_find(CPUState *cpu,
     } else {
         cpu->cflags_next_tb = -1;
     }
+    cflags = cflags_for_breakpoints(cpu, pc, cflags);
 
     tb = tb_lookup(cpu, pc, cs_base, flags, cflags);
     if (tb == NULL) {
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 586f8ca4ef..262659780b 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -1437,10 +1437,6 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
     }
     QEMU_BUILD_BUG_ON(CF_COUNT_MASK + 1 != TCG_MAX_INSNS);
 
-    if (cpu->singlestep_enabled) {
-        max_insns = 1;
-    }
-
  buffer_overflow:
     tb = tcg_tb_alloc(tcg_ctx);
     if (unlikely(!tb)) {
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index 449159a27c..01b48137da 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -33,13 +33,8 @@ void translator_loop_temp_check(DisasContextBase *db)
 
 bool translator_use_goto_tb(DisasContextBase *db, target_ulong dest)
 {
-    /* Suppress goto_tb if requested. */
-    if (tb_cflags(db->tb) & CF_NO_GOTO_TB) {
-        return false;
-    }
-
-    /* Suppress goto_tb in the case of single-steping.  */
-    if (db->singlestep_enabled) {
+    /* Suppress goto_tb if requested, or required by breakpoints. */
+    if (tb_cflags(db->tb) & (CF_NO_GOTO_TB | CF_BP_MASK)) {
         return false;
     }
 
@@ -51,6 +46,7 @@ void translator_loop(const TranslatorOps *ops, DisasContextBase *db,
                      CPUState *cpu, TranslationBlock *tb, int max_insns)
 {
     uint32_t cflags = tb_cflags(tb);
+    int bp_flags = 0;
     bool plugin_enabled;
 
     /* Initialize DisasContext */
@@ -60,7 +56,23 @@ void translator_loop(const TranslatorOps *ops, DisasContextBase *db,
     db->is_jmp = DISAS_NEXT;
     db->num_insns = 0;
     db->max_insns = max_insns;
-    db->singlestep_enabled = cpu->singlestep_enabled;
+    db->singlestep_enabled = false;
+
+    switch (cflags & CF_BP_MASK) {
+    case CF_BP_NONE:
+        break;
+    case CF_BP_SSTEP:
+        db->singlestep_enabled = true;
+        break;
+    case CF_BP_GDB:
+        bp_flags = BP_GDB;
+        break;
+    case CF_BP_CPU:
+        bp_flags = BP_CPU;
+        break;
+    default:
+        g_assert_not_reached();
+    }
 
     ops->init_disas_context(db, cpu);
     tcg_debug_assert(db->is_jmp == DISAS_NEXT);  /* no early exit */
@@ -85,39 +97,34 @@ void translator_loop(const TranslatorOps *ops, DisasContextBase *db,
         }
 
         /* Pass breakpoint hits to target for further processing */
-        if (!db->singlestep_enabled
-            && unlikely(!QTAILQ_EMPTY(&cpu->breakpoints))) {
-            CPUBreakpoint *bp;
-            QTAILQ_FOREACH(bp, &cpu->breakpoints, entry) {
-                if (bp->pc == db->pc_next) {
-                    int len = ops->breakpoint_check(db, cpu, bp->flags);
+        if (unlikely(bp_flags)) {
+            int len = ops->breakpoint_check(db, cpu, bp_flags);
 
-                    /*
-                     * The breakpoint_check hook may use DISAS_TOO_MANY
-                     * to indicate that only one more instruction is to
-                     * be executed.  Otherwise it should use DISAS_NORETURN
-                     * when generating an exception, but may use a
-                     * DISAS_TARGET_* value for Something Else.
-                     */
-                    if (db->is_jmp > DISAS_TOO_MANY) {
-                        /*
-                         * The address covered by the breakpoint must be
-                         * included in [tb->pc, tb->pc + tb->size) in order
-                         * to for it to be properly cleared.  Thus we
-                         * increment the PC here so that the logic setting
-                         * tb->size below does the right thing.
-                         */
-                        tcg_debug_assert(len > 0);
-                        db->pc_next += len;
+            /*
+             * When there is a bp hit, we're going to execute a maximum
+             * of one instruction.  The breakpoint_check hook may use
+             * DISAS_NEXT or DISAS_TOO_MANY to indicate that the current
+             * instruction should be translated.  Anything else is taken
+             * to mean that an exception has been raised and that zero
+             * instructions will be executed.
+             */
+            if (db->is_jmp > DISAS_TOO_MANY) {
+                /*
+                 * The address covered by the breakpoint must be
+                 * included in [tb->pc, tb->pc + tb->size) in order
+                 * to for it to be properly cleared.  Thus we
+                 * increment the PC here so that the logic setting
+                 * tb->size below does the right thing.
+                 */
+                tcg_debug_assert(len > 0);
+                db->pc_next += len;
 
-                        /*
-                         * The breakpoint definitely hit, so decrement the
-                         * number of instructions completed for icount.
-                         */
-                        db->num_insns--;
-                        goto done;
-                    }
-                }
+                /*
+                 * The breakpoint definitely hit, so decrement the
+                 * number of instructions completed for icount.
+                 */
+                db->num_insns--;
+                goto done;
             }
         }
 
diff --git a/cpu.c b/cpu.c
index 164fefeaa3..2c9da10d0f 100644
--- a/cpu.c
+++ b/cpu.c
@@ -224,11 +224,6 @@ void tb_invalidate_phys_addr(target_ulong addr)
     tb_invalidate_phys_page_range(addr, addr + 1);
     mmap_unlock();
 }
-
-static void breakpoint_invalidate(CPUState *cpu, target_ulong pc)
-{
-    tb_invalidate_phys_addr(pc);
-}
 #else
 void tb_invalidate_phys_addr(AddressSpace *as, hwaddr addr, MemTxAttrs attrs)
 {
@@ -249,17 +244,6 @@ void tb_invalidate_phys_addr(AddressSpace *as, hwaddr addr, MemTxAttrs attrs)
     ram_addr = memory_region_get_ram_addr(mr) + addr;
     tb_invalidate_phys_page_range(ram_addr, ram_addr + 1);
 }
-
-static void breakpoint_invalidate(CPUState *cpu, target_ulong pc)
-{
-    /*
-     * There may not be a virtual to physical translation for the pc
-     * right now, but there may exist cached TB for this pc.
-     * Flush the whole TB cache to force re-translation of such TBs.
-     * This is heavyweight, but we're debugging anyway.
-     */
-    tb_flush(cpu);
-}
 #endif
 
 /* Add a breakpoint.  */
@@ -280,8 +264,6 @@ int cpu_breakpoint_insert(CPUState *cpu, vaddr pc, int flags,
         QTAILQ_INSERT_TAIL(&cpu->breakpoints, bp, entry);
     }
 
-    breakpoint_invalidate(cpu, pc);
-
     if (breakpoint) {
         *breakpoint = bp;
     }
@@ -307,8 +289,6 @@ void cpu_breakpoint_remove_by_ref(CPUState *cpu, CPUBreakpoint *breakpoint)
 {
     QTAILQ_REMOVE(&cpu->breakpoints, breakpoint, entry);
 
-    breakpoint_invalidate(cpu, breakpoint->pc);
-
     g_free(breakpoint);
 }
 
@@ -332,10 +312,6 @@ void cpu_single_step(CPUState *cpu, int enabled)
         cpu->singlestep_enabled = enabled;
         if (kvm_enabled()) {
             kvm_update_guest_debug(cpu, 0);
-        } else {
-            /* must flush all the translated code to avoid inconsistencies */
-            /* XXX: only flush what is necessary */
-            tb_flush(cpu);
         }
     }
 }
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 17/17] cpu: Add breakpoint tracepoints
  2021-07-01 15:25 [PATCH 00/17] tcg: breakpoint reorg Richard Henderson
                   ` (15 preceding siblings ...)
  2021-07-01 15:25 ` [PATCH 16/17] accel/tcg: Encode breakpoint info into tb->cflags Richard Henderson
@ 2021-07-01 15:25 ` Richard Henderson
  2021-07-01 16:43   ` Philippe Mathieu-Daudé
  2021-07-03  7:00 ` [PATCH 00/17] tcg: breakpoint reorg Mark Cave-Ayland
  17 siblings, 1 reply; 24+ messages in thread
From: Richard Henderson @ 2021-07-01 15:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, mark.cave-ayland, alex.bennee, f4bug

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 cpu.c        | 11 ++++++++---
 trace-events |  5 +++++
 2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/cpu.c b/cpu.c
index 2c9da10d0f..addcb5db9c 100644
--- a/cpu.c
+++ b/cpu.c
@@ -38,6 +38,7 @@
 #include "exec/translate-all.h"
 #include "exec/log.h"
 #include "hw/core/accel-cpu.h"
+#include "trace/trace-root.h"
 
 uintptr_t qemu_host_page_size;
 intptr_t qemu_host_page_mask;
@@ -267,6 +268,8 @@ int cpu_breakpoint_insert(CPUState *cpu, vaddr pc, int flags,
     if (breakpoint) {
         *breakpoint = bp;
     }
+
+    trace_breakpoint_insert(cpu->cpu_index, pc, flags);
     return 0;
 }
 
@@ -285,11 +288,12 @@ int cpu_breakpoint_remove(CPUState *cpu, vaddr pc, int flags)
 }
 
 /* Remove a specific breakpoint by reference.  */
-void cpu_breakpoint_remove_by_ref(CPUState *cpu, CPUBreakpoint *breakpoint)
+void cpu_breakpoint_remove_by_ref(CPUState *cpu, CPUBreakpoint *bp)
 {
-    QTAILQ_REMOVE(&cpu->breakpoints, breakpoint, entry);
+    QTAILQ_REMOVE(&cpu->breakpoints, bp, entry);
 
-    g_free(breakpoint);
+    trace_breakpoint_remove(cpu->cpu_index, bp->pc, bp->flags);
+    g_free(bp);
 }
 
 /* Remove all matching breakpoints. */
@@ -313,6 +317,7 @@ void cpu_single_step(CPUState *cpu, int enabled)
         if (kvm_enabled()) {
             kvm_update_guest_debug(cpu, 0);
         }
+        trace_breakpoint_singlestep(cpu->cpu_index, enabled);
     }
 }
 
diff --git a/trace-events b/trace-events
index 765fe251e6..c4cca29939 100644
--- a/trace-events
+++ b/trace-events
@@ -25,6 +25,11 @@
 #
 # The <format-string> should be a sprintf()-compatible format string.
 
+# cpu.c
+breakpoint_insert(int cpu_index, uint64_t pc, int flags) "cpu=%d pc=0x%" PRIx64 " flags=0x%x"
+breakpoint_remove(int cpu_index, uint64_t pc, int flags) "cpu=%d pc=0x%" PRIx64 " flags=0x%x"
+breakpoint_singlestep(int cpu_index, int enabled) "cpu=%d enable=%d"
+
 # dma-helpers.c
 dma_blk_io(void *dbs, void *bs, int64_t offset, bool to_dev) "dbs=%p bs=%p offset=%" PRId64 " to_dev=%d"
 dma_aio_cancel(void *dbs) "dbs=%p"
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH 02/17] accel/tcg: Move helper_lookup_tb_ptr to cpu-exec.c
  2021-07-01 15:25 ` [PATCH 02/17] accel/tcg: Move helper_lookup_tb_ptr to cpu-exec.c Richard Henderson
@ 2021-07-01 16:33   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 24+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-07-01 16:33 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: alex.bennee, peter.maydell, mark.cave-ayland

On 7/1/21 5:25 PM, Richard Henderson wrote:
> This will allow additional code sharing.
> No functional change.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  accel/tcg/cpu-exec.c    | 30 ++++++++++++++++++++++++++++++
>  accel/tcg/tcg-runtime.c | 22 ----------------------
>  2 files changed, 30 insertions(+), 22 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 03/17] accel/tcg: Move tb_lookup to cpu-exec.c
  2021-07-01 15:25 ` [PATCH 03/17] accel/tcg: Move tb_lookup " Richard Henderson
@ 2021-07-01 16:34   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 24+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-07-01 16:34 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: alex.bennee, peter.maydell, mark.cave-ayland

On 7/1/21 5:25 PM, Richard Henderson wrote:
> Now that we've moved helper_lookup_tb_ptr, the only user
> of tb-lookup.h is cpu-exec.c; merge the contents in.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  accel/tcg/tb-lookup.h | 49 -------------------------------------------
>  accel/tcg/cpu-exec.c  | 31 ++++++++++++++++++++++++++-
>  2 files changed, 30 insertions(+), 50 deletions(-)
>  delete mode 100644 accel/tcg/tb-lookup.h

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 05/17] accel/tcg: Log tb->cflags with -d exec
  2021-07-01 15:25 ` [PATCH 05/17] accel/tcg: Log tb->cflags with -d exec Richard Henderson
@ 2021-07-01 16:36   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 24+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-07-01 16:36 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: alex.bennee, peter.maydell, mark.cave-ayland

On 7/1/21 5:25 PM, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  accel/tcg/cpu-exec.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 06/17] tcg: Remove TCG_TARGET_HAS_goto_ptr
  2021-07-01 15:25 ` [PATCH 06/17] tcg: Remove TCG_TARGET_HAS_goto_ptr Richard Henderson
@ 2021-07-01 16:37   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 24+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-07-01 16:37 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: alex.bennee, peter.maydell, mark.cave-ayland

On 7/1/21 5:25 PM, Richard Henderson wrote:
> Since 6eea04347eb6, all tcg backends support goto_ptr.
> Remove the conditional, making support mandatory.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  include/tcg/tcg-opc.h    | 3 +--
>  tcg/aarch64/tcg-target.h | 1 -
>  tcg/arm/tcg-target.h     | 1 -
>  tcg/i386/tcg-target.h    | 1 -
>  tcg/mips/tcg-target.h    | 1 -
>  tcg/ppc/tcg-target.h     | 1 -
>  tcg/riscv/tcg-target.h   | 1 -
>  tcg/s390/tcg-target.h    | 1 -
>  tcg/sparc/tcg-target.h   | 1 -
>  tcg/tci/tcg-target.h     | 1 -
>  tcg/tcg-op.c             | 2 +-
>  tcg/tcg.c                | 8 ++------
>  12 files changed, 4 insertions(+), 18 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 17/17] cpu: Add breakpoint tracepoints
  2021-07-01 15:25 ` [PATCH 17/17] cpu: Add breakpoint tracepoints Richard Henderson
@ 2021-07-01 16:43   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 24+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-07-01 16:43 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: alex.bennee, peter.maydell, mark.cave-ayland

On 7/1/21 5:25 PM, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  cpu.c        | 11 ++++++++---
>  trace-events |  5 +++++
>  2 files changed, 13 insertions(+), 3 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 00/17] tcg: breakpoint reorg
  2021-07-01 15:25 [PATCH 00/17] tcg: breakpoint reorg Richard Henderson
                   ` (16 preceding siblings ...)
  2021-07-01 15:25 ` [PATCH 17/17] cpu: Add breakpoint tracepoints Richard Henderson
@ 2021-07-03  7:00 ` Mark Cave-Ayland
  17 siblings, 0 replies; 24+ messages in thread
From: Mark Cave-Ayland @ 2021-07-03  7:00 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: peter.maydell, alex.bennee, f4bug

On 01/07/2021 16:25, Richard Henderson wrote:

> Based-on: <20210630183226.3290849-1-richard.henderson@linaro.org>
> ("[PATCH v2 00/28] accel/tcg: Introduce translator_use_goto_tb")
> 
> This is my attempt at fixing #404 ("windows xp boot takes much longer...").
> 
> I don't actually have windows xp available myself, so I don't know
> if this has worked, really.  I can still boot windows 7, but from
> the lack of tracepoint firings I guess it doesn't play any silly
> games with breakpoints.
> 
> This scheme is not without its drawbacks.  In exchange for no bookkeeping
> and invalidation whatsoever, other code on the same page as an active
> breakpoint runs one insn per tb, doing indirect chaining through
> helper_lookup_tb_ptr to see if we hit the breakpoint.
> 
> The minor testing that I did seemed fast enough though, with gdb
> responding quickly.  So before I go off and try to complicate
> things again with extra bookkeeping, I thought I'd get some feedback.
> 
> 
> r~
> 
> 
> Richard Henderson (17):
>    target/i386: Use cpu_breakpoint_test in breakpoint_handler
>    accel/tcg: Move helper_lookup_tb_ptr to cpu-exec.c
>    accel/tcg: Move tb_lookup to cpu-exec.c
>    accel/tcg: Split out log_cpu_exec
>    accel/tcg: Log tb->cflags with -d exec
>    tcg: Remove TCG_TARGET_HAS_goto_ptr
>    accel/tcg: Reduce CF_COUNT_MASK to match TCG_MAX_INSNS
>    accel/tcg: Move curr_cflags into cpu-exec.c
>    accel/tcg: Add CF_NO_GOTO_TB and CF_NO_GOTO_PTR
>    accel/tcg: Drop CF_NO_GOTO_PTR from -d nochain
>    accel/tcg: Handle -singlestep in curr_cflags
>    accel/tcg: Use CF_NO_GOTO_{TB,PTR} in cpu_exec_step_atomic
>    accel/tcg: Move cflags lookup into tb_find
>    accel/tcg: Adjust interface of TranslatorOps.breakpoint_check
>    accel/tcg: Hoist tb_cflags to a local in translator_loop
>    accel/tcg: Encode breakpoint info into tb->cflags
>    cpu: Add breakpoint tracepoints
> 
>   accel/tcg/tb-lookup.h               |  49 ------
>   include/exec/exec-all.h             |  30 ++--
>   include/exec/translator.h           |  17 +-
>   include/tcg/tcg-opc.h               |   3 +-
>   tcg/aarch64/tcg-target.h            |   1 -
>   tcg/arm/tcg-target.h                |   1 -
>   tcg/i386/tcg-target.h               |   1 -
>   tcg/mips/tcg-target.h               |   1 -
>   tcg/ppc/tcg-target.h                |   1 -
>   tcg/riscv/tcg-target.h              |   1 -
>   tcg/s390/tcg-target.h               |   1 -
>   tcg/sparc/tcg-target.h              |   1 -
>   tcg/tci/tcg-target.h                |   1 -
>   accel/tcg/cpu-exec.c                | 238 +++++++++++++++++++++++-----
>   accel/tcg/tcg-runtime.c             |  22 ---
>   accel/tcg/translate-all.c           |   7 +-
>   accel/tcg/translator.c              |  79 ++++++---
>   cpu.c                               |  35 +---
>   target/alpha/translate.c            |  12 +-
>   target/arm/translate-a64.c          |  14 +-
>   target/arm/translate.c              |  20 +--
>   target/avr/translate.c              |   6 +-
>   target/cris/translate.c             |  14 +-
>   target/hexagon/translate.c          |  13 +-
>   target/hppa/translate.c             |   7 +-
>   target/i386/tcg/sysemu/bpt_helper.c |  12 +-
>   target/i386/tcg/translate.c         |  15 +-
>   target/m68k/translate.c             |  14 +-
>   target/microblaze/translate.c       |  14 +-
>   target/mips/tcg/translate.c         |  14 +-
>   target/nios2/translate.c            |  13 +-
>   target/openrisc/translate.c         |  11 +-
>   target/ppc/translate.c              |  13 +-
>   target/riscv/translate.c            |  11 +-
>   target/rx/translate.c               |   8 +-
>   target/s390x/translate.c            |  12 +-
>   target/sh4/translate.c              |  12 +-
>   target/sparc/translate.c            |   9 +-
>   target/tricore/translate.c          |  13 +-
>   target/xtensa/translate.c           |  12 +-
>   tcg/tcg-op.c                        |  28 ++--
>   tcg/tcg.c                           |   8 +-
>   trace-events                        |   5 +
>   43 files changed, 386 insertions(+), 413 deletions(-)
>   delete mode 100644 accel/tcg/tb-lookup.h

Thanks Richard! I grabbed the git tag from patchew and gave it a quick smoke test 
booting a pre-installed WinXP image to the login screen. I've included some extra 
times below for comparison taken from my rather modest laptop:

b55f54bc~1 (i.e. last known good commit): 43s
current master (9c2647f750): 2m 40s
breakpoint reorg patchew tag: 25s(!)

I can certainly report that booting WinXP from the patchew tag is substantially 
faster than both git master and the last known good commit before the TLB flush logic 
was altered, so the initial results are extremely promising.


ATB,

Mark.


^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2021-07-03  7:02 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-01 15:25 [PATCH 00/17] tcg: breakpoint reorg Richard Henderson
2021-07-01 15:25 ` [PATCH 01/17] target/i386: Use cpu_breakpoint_test in breakpoint_handler Richard Henderson
2021-07-01 15:25 ` [PATCH 02/17] accel/tcg: Move helper_lookup_tb_ptr to cpu-exec.c Richard Henderson
2021-07-01 16:33   ` Philippe Mathieu-Daudé
2021-07-01 15:25 ` [PATCH 03/17] accel/tcg: Move tb_lookup " Richard Henderson
2021-07-01 16:34   ` Philippe Mathieu-Daudé
2021-07-01 15:25 ` [PATCH 04/17] accel/tcg: Split out log_cpu_exec Richard Henderson
2021-07-01 15:25 ` [PATCH 05/17] accel/tcg: Log tb->cflags with -d exec Richard Henderson
2021-07-01 16:36   ` Philippe Mathieu-Daudé
2021-07-01 15:25 ` [PATCH 06/17] tcg: Remove TCG_TARGET_HAS_goto_ptr Richard Henderson
2021-07-01 16:37   ` Philippe Mathieu-Daudé
2021-07-01 15:25 ` [PATCH 07/17] accel/tcg: Reduce CF_COUNT_MASK to match TCG_MAX_INSNS Richard Henderson
2021-07-01 15:25 ` [PATCH 08/17] accel/tcg: Move curr_cflags into cpu-exec.c Richard Henderson
2021-07-01 15:25 ` [PATCH 09/17] accel/tcg: Add CF_NO_GOTO_TB and CF_NO_GOTO_PTR Richard Henderson
2021-07-01 15:25 ` [PATCH 10/17] accel/tcg: Drop CF_NO_GOTO_PTR from -d nochain Richard Henderson
2021-07-01 15:25 ` [PATCH 11/17] accel/tcg: Handle -singlestep in curr_cflags Richard Henderson
2021-07-01 15:25 ` [PATCH 12/17] accel/tcg: Use CF_NO_GOTO_{TB, PTR} in cpu_exec_step_atomic Richard Henderson
2021-07-01 15:25 ` [PATCH 13/17] accel/tcg: Move cflags lookup into tb_find Richard Henderson
2021-07-01 15:25 ` [PATCH 14/17] accel/tcg: Adjust interface of TranslatorOps.breakpoint_check Richard Henderson
2021-07-01 15:25 ` [PATCH 15/17] accel/tcg: Hoist tb_cflags to a local in translator_loop Richard Henderson
2021-07-01 15:25 ` [PATCH 16/17] accel/tcg: Encode breakpoint info into tb->cflags Richard Henderson
2021-07-01 15:25 ` [PATCH 17/17] cpu: Add breakpoint tracepoints Richard Henderson
2021-07-01 16:43   ` Philippe Mathieu-Daudé
2021-07-03  7:00 ` [PATCH 00/17] tcg: breakpoint reorg Mark Cave-Ayland

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).