* [PATCH v2 00/10] tcg: breakpoint reorg
@ 2021-07-12 15:39 Richard Henderson
2021-07-12 15:39 ` [PATCH v2 01/10] accel/tcg: Reduce CF_COUNT_MASK to match TCG_MAX_INSNS Richard Henderson
` (11 more replies)
0 siblings, 12 replies; 37+ messages in thread
From: Richard Henderson @ 2021-07-12 15:39 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, alex.bennee, f4bug
This is fixing #404 ("windows xp boot takes much longer...")
and several other similar reports.
For v2, all prerequisites and 7 of the patches from v1 with
reviews are now upstream.
Mark Cave-Ayland reported success with WinXP with v1, with
this patch set being even faster than b55f54bc~1. Which was
a bit of a surprise, but I'll take it. It means that it's
probably not worth making the breakpoint detection scheme
any more complicated.
I'd still like some more feedback. Given this is fixing a
regression from qemu 5.2 I feel comfortable delaying this
past soft freeze, but not past hard freeze on the 20th.
r~
Richard Henderson (10):
accel/tcg: Reduce CF_COUNT_MASK to match TCG_MAX_INSNS
accel/tcg: Move curr_cflags into cpu-exec.c
accel/tcg: Add CF_NO_GOTO_TB and CF_NO_GOTO_PTR
accel/tcg: Drop CF_NO_GOTO_PTR from -d nochain
accel/tcg: Handle -singlestep in curr_cflags
accel/tcg: Use CF_NO_GOTO_{TB,PTR} in cpu_exec_step_atomic
accel/tcg: Move cflags lookup into tb_find
accel/tcg: Adjust interface of TranslatorOps.breakpoint_check
accel/tcg: Hoist tb_cflags to a local in translator_loop
accel/tcg: Encode breakpoint info into tb->cflags
include/exec/exec-all.h | 30 +++++---
include/exec/translator.h | 17 +++--
accel/tcg/cpu-exec.c | 130 ++++++++++++++++++++++++++++------
accel/tcg/translate-all.c | 7 +-
accel/tcg/translator.c | 79 ++++++++++++++-------
cpu.c | 24 -------
target/alpha/translate.c | 12 +---
target/arm/translate-a64.c | 14 ++--
target/arm/translate.c | 20 +++---
target/avr/translate.c | 6 +-
target/cris/translate.c | 14 ++--
target/hexagon/translate.c | 13 +---
target/hppa/translate.c | 7 +-
target/i386/tcg/translate.c | 15 ++--
target/m68k/translate.c | 14 +---
target/microblaze/translate.c | 14 +---
target/mips/tcg/translate.c | 14 ++--
target/nios2/translate.c | 13 +---
target/openrisc/translate.c | 11 +--
target/ppc/translate.c | 13 +---
target/riscv/translate.c | 11 +--
target/rx/translate.c | 8 +--
target/s390x/translate.c | 12 ++--
target/sh4/translate.c | 12 ++--
target/sparc/translate.c | 9 ++-
target/tricore/translate.c | 13 +---
target/xtensa/translate.c | 12 ++--
tcg/tcg-op.c | 28 ++++----
28 files changed, 280 insertions(+), 292 deletions(-)
--
2.25.1
^ permalink raw reply [flat|nested] 37+ messages in thread
* [PATCH v2 01/10] accel/tcg: Reduce CF_COUNT_MASK to match TCG_MAX_INSNS
2021-07-12 15:39 [PATCH v2 00/10] tcg: breakpoint reorg Richard Henderson
@ 2021-07-12 15:39 ` Richard Henderson
2021-07-17 17:30 ` Peter Maydell
2021-07-17 17:33 ` Alex Bennée
2021-07-12 15:39 ` [PATCH v2 02/10] accel/tcg: Move curr_cflags into cpu-exec.c Richard Henderson
` (10 subsequent siblings)
11 siblings, 2 replies; 37+ messages in thread
From: Richard Henderson @ 2021-07-12 15:39 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, alex.bennee, f4bug
The space reserved for CF_COUNT_MASK was overly large.
Reduce to free up cflags bits and eliminate an extra test.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
include/exec/exec-all.h | 4 +++-
accel/tcg/translate-all.c | 5 ++---
2 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 754f4130c9..dfe82ed19c 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -492,7 +492,9 @@ struct TranslationBlock {
target_ulong cs_base; /* CS base for this block */
uint32_t flags; /* flags defining in which context the code was generated */
uint32_t cflags; /* compile flags */
-#define CF_COUNT_MASK 0x00007fff
+
+/* Note that TCG_MAX_INSNS is 512; we validate this match elsewhere. */
+#define CF_COUNT_MASK 0x000001ff
#define CF_LAST_IO 0x00008000 /* Last insn may be an IO access. */
#define CF_MEMI_ONLY 0x00010000 /* Only instrument memory ops */
#define CF_USE_ICOUNT 0x00020000
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 4df26de858..997e44c73b 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -1430,9 +1430,8 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
if (max_insns == 0) {
max_insns = CF_COUNT_MASK;
}
- if (max_insns > TCG_MAX_INSNS) {
- max_insns = TCG_MAX_INSNS;
- }
+ QEMU_BUILD_BUG_ON(CF_COUNT_MASK + 1 != TCG_MAX_INSNS);
+
if (cpu->singlestep_enabled || singlestep) {
max_insns = 1;
}
--
2.25.1
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PATCH v2 02/10] accel/tcg: Move curr_cflags into cpu-exec.c
2021-07-12 15:39 [PATCH v2 00/10] tcg: breakpoint reorg Richard Henderson
2021-07-12 15:39 ` [PATCH v2 01/10] accel/tcg: Reduce CF_COUNT_MASK to match TCG_MAX_INSNS Richard Henderson
@ 2021-07-12 15:39 ` Richard Henderson
2021-07-17 17:32 ` Peter Maydell
2021-07-17 17:34 ` Alex Bennée
2021-07-12 15:39 ` [PATCH v2 03/10] accel/tcg: Add CF_NO_GOTO_TB and CF_NO_GOTO_PTR Richard Henderson
` (9 subsequent siblings)
11 siblings, 2 replies; 37+ messages in thread
From: Richard Henderson @ 2021-07-12 15:39 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, alex.bennee, f4bug
We will shortly have more than a simple member read here,
with stuff not necessarily exposed to exec/exec-all.h.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
include/exec/exec-all.h | 5 +----
accel/tcg/cpu-exec.c | 5 +++++
2 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index dfe82ed19c..ae7603ca75 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -565,10 +565,7 @@ static inline uint32_t tb_cflags(const TranslationBlock *tb)
}
/* current cflags for hashing/comparison */
-static inline uint32_t curr_cflags(CPUState *cpu)
-{
- return cpu->tcg_cflags;
-}
+uint32_t curr_cflags(CPUState *cpu);
/* TranslationBlock invalidate API */
#if defined(CONFIG_USER_ONLY)
diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index e22bcb99f7..ef4214d893 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -145,6 +145,11 @@ static void init_delay_params(SyncClocks *sc, const CPUState *cpu)
}
#endif /* CONFIG USER ONLY */
+uint32_t curr_cflags(CPUState *cpu)
+{
+ return cpu->tcg_cflags;
+}
+
/* Might cause an exception, so have a longjmp destination ready */
static inline TranslationBlock *tb_lookup(CPUState *cpu, target_ulong pc,
target_ulong cs_base,
--
2.25.1
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PATCH v2 03/10] accel/tcg: Add CF_NO_GOTO_TB and CF_NO_GOTO_PTR
2021-07-12 15:39 [PATCH v2 00/10] tcg: breakpoint reorg Richard Henderson
2021-07-12 15:39 ` [PATCH v2 01/10] accel/tcg: Reduce CF_COUNT_MASK to match TCG_MAX_INSNS Richard Henderson
2021-07-12 15:39 ` [PATCH v2 02/10] accel/tcg: Move curr_cflags into cpu-exec.c Richard Henderson
@ 2021-07-12 15:39 ` Richard Henderson
2021-07-17 17:36 ` Alex Bennée
2021-07-17 17:36 ` Peter Maydell
2021-07-12 15:39 ` [PATCH v2 04/10] accel/tcg: Drop CF_NO_GOTO_PTR from -d nochain Richard Henderson
` (8 subsequent siblings)
11 siblings, 2 replies; 37+ messages in thread
From: Richard Henderson @ 2021-07-12 15:39 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, alex.bennee, f4bug
Move the -d nochain check to bits on tb->cflags.
These will be used for more than -d nochain shortly.
Set bits during curr_cflags, test them in translator_use_goto_tb,
assert we're not doing anything odd in tcg_gen_goto_tb. The test
in tcg_gen_exit_tb is redundant with the assert for goto_tb_issue_mask.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
include/exec/exec-all.h | 16 +++++++++-------
accel/tcg/cpu-exec.c | 8 +++++++-
accel/tcg/translator.c | 5 +++++
tcg/tcg-op.c | 28 ++++++++++++----------------
4 files changed, 33 insertions(+), 24 deletions(-)
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index ae7603ca75..6873cce8df 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -494,13 +494,15 @@ struct TranslationBlock {
uint32_t cflags; /* compile flags */
/* Note that TCG_MAX_INSNS is 512; we validate this match elsewhere. */
-#define CF_COUNT_MASK 0x000001ff
-#define CF_LAST_IO 0x00008000 /* Last insn may be an IO access. */
-#define CF_MEMI_ONLY 0x00010000 /* Only instrument memory ops */
-#define CF_USE_ICOUNT 0x00020000
-#define CF_INVALID 0x00040000 /* TB is stale. Set with @jmp_lock held */
-#define CF_PARALLEL 0x00080000 /* Generate code for a parallel context */
-#define CF_CLUSTER_MASK 0xff000000 /* Top 8 bits are cluster ID */
+#define CF_COUNT_MASK 0x000001ff
+#define CF_NO_GOTO_TB 0x00000200 /* Do not chain with goto_tb */
+#define CF_NO_GOTO_PTR 0x00000400 /* Do not chain with goto_ptr */
+#define CF_LAST_IO 0x00008000 /* Last insn may be an IO access. */
+#define CF_MEMI_ONLY 0x00010000 /* Only instrument memory ops */
+#define CF_USE_ICOUNT 0x00020000
+#define CF_INVALID 0x00040000 /* TB is stale. Set with @jmp_lock held */
+#define CF_PARALLEL 0x00080000 /* Generate code for a parallel context */
+#define CF_CLUSTER_MASK 0xff000000 /* Top 8 bits are cluster ID */
#define CF_CLUSTER_SHIFT 24
/* Per-vCPU dynamic tracing state used to generate this TB */
diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index ef4214d893..d3232d5764 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -147,7 +147,13 @@ static void init_delay_params(SyncClocks *sc, const CPUState *cpu)
uint32_t curr_cflags(CPUState *cpu)
{
- return cpu->tcg_cflags;
+ uint32_t cflags = cpu->tcg_cflags;
+
+ if (qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
+ cflags |= CF_NO_GOTO_TB | CF_NO_GOTO_PTR;
+ }
+
+ return cflags;
}
/* Might cause an exception, so have a longjmp destination ready */
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index 59804af37b..2ea5a74f30 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -33,6 +33,11 @@ void translator_loop_temp_check(DisasContextBase *db)
bool translator_use_goto_tb(DisasContextBase *db, target_ulong dest)
{
+ /* Suppress goto_tb if requested. */
+ if (tb_cflags(db->tb) & CF_NO_GOTO_TB) {
+ return false;
+ }
+
/* Suppress goto_tb in the case of single-steping. */
if (db->singlestep_enabled || singlestep) {
return false;
diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index 0c561fb253..e0d54d537f 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -2723,10 +2723,6 @@ void tcg_gen_exit_tb(const TranslationBlock *tb, unsigned idx)
seen this numbered exit before, via tcg_gen_goto_tb. */
tcg_debug_assert(tcg_ctx->goto_tb_issue_mask & (1 << idx));
#endif
- /* When not chaining, exit without indicating a link. */
- if (qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
- val = 0;
- }
} else {
/* This is an exit via the exitreq label. */
tcg_debug_assert(idx == TB_EXIT_REQUESTED);
@@ -2738,6 +2734,8 @@ void tcg_gen_exit_tb(const TranslationBlock *tb, unsigned idx)
void tcg_gen_goto_tb(unsigned idx)
{
+ /* We tested CF_NO_GOTO_TB in translator_use_goto_tb. */
+ tcg_debug_assert(!(tcg_ctx->tb_cflags & CF_NO_GOTO_TB));
/* We only support two chained exits. */
tcg_debug_assert(idx <= TB_EXIT_IDXMAX);
#ifdef CONFIG_DEBUG_TCG
@@ -2746,25 +2744,23 @@ void tcg_gen_goto_tb(unsigned idx)
tcg_ctx->goto_tb_issue_mask |= 1 << idx;
#endif
plugin_gen_disable_mem_helpers();
- /* When not chaining, we simply fall through to the "fallback" exit. */
- if (!qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
- tcg_gen_op1i(INDEX_op_goto_tb, idx);
- }
+ tcg_gen_op1i(INDEX_op_goto_tb, idx);
}
void tcg_gen_lookup_and_goto_ptr(void)
{
- if (!qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
- TCGv_ptr ptr;
+ TCGv_ptr ptr;
- plugin_gen_disable_mem_helpers();
- ptr = tcg_temp_new_ptr();
- gen_helper_lookup_tb_ptr(ptr, cpu_env);
- tcg_gen_op1i(INDEX_op_goto_ptr, tcgv_ptr_arg(ptr));
- tcg_temp_free_ptr(ptr);
- } else {
+ if (tcg_ctx->tb_cflags & CF_NO_GOTO_PTR) {
tcg_gen_exit_tb(NULL, 0);
+ return;
}
+
+ plugin_gen_disable_mem_helpers();
+ ptr = tcg_temp_new_ptr();
+ gen_helper_lookup_tb_ptr(ptr, cpu_env);
+ tcg_gen_op1i(INDEX_op_goto_ptr, tcgv_ptr_arg(ptr));
+ tcg_temp_free_ptr(ptr);
}
static inline MemOp tcg_canonicalize_memop(MemOp op, bool is64, bool st)
--
2.25.1
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PATCH v2 04/10] accel/tcg: Drop CF_NO_GOTO_PTR from -d nochain
2021-07-12 15:39 [PATCH v2 00/10] tcg: breakpoint reorg Richard Henderson
` (2 preceding siblings ...)
2021-07-12 15:39 ` [PATCH v2 03/10] accel/tcg: Add CF_NO_GOTO_TB and CF_NO_GOTO_PTR Richard Henderson
@ 2021-07-12 15:39 ` Richard Henderson
2021-07-17 17:39 ` Peter Maydell
2021-07-17 17:39 ` Alex Bennée
2021-07-12 15:39 ` [PATCH v2 05/10] accel/tcg: Handle -singlestep in curr_cflags Richard Henderson
` (7 subsequent siblings)
11 siblings, 2 replies; 37+ messages in thread
From: Richard Henderson @ 2021-07-12 15:39 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, alex.bennee, f4bug
The purpose of suppressing goto_ptr from -d nochain had been
to return to the main loop so that -d cpu would be recognized.
But we now include -d cpu logging in helper_lookup_tb_ptr so
there is no need to exclude goto_ptr.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
accel/tcg/cpu-exec.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index d3232d5764..70ea3c7d68 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -150,7 +150,7 @@ uint32_t curr_cflags(CPUState *cpu)
uint32_t cflags = cpu->tcg_cflags;
if (qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
- cflags |= CF_NO_GOTO_TB | CF_NO_GOTO_PTR;
+ cflags |= CF_NO_GOTO_TB;
}
return cflags;
--
2.25.1
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PATCH v2 05/10] accel/tcg: Handle -singlestep in curr_cflags
2021-07-12 15:39 [PATCH v2 00/10] tcg: breakpoint reorg Richard Henderson
` (3 preceding siblings ...)
2021-07-12 15:39 ` [PATCH v2 04/10] accel/tcg: Drop CF_NO_GOTO_PTR from -d nochain Richard Henderson
@ 2021-07-12 15:39 ` Richard Henderson
2021-07-17 17:41 ` Peter Maydell
2021-07-17 17:42 ` Alex Bennée
2021-07-12 15:40 ` [PATCH v2 06/10] accel/tcg: Use CF_NO_GOTO_{TB, PTR} in cpu_exec_step_atomic Richard Henderson
` (6 subsequent siblings)
11 siblings, 2 replies; 37+ messages in thread
From: Richard Henderson @ 2021-07-12 15:39 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, alex.bennee, f4bug
Exchange the test in translator_use_goto_tb for CF_NO_GOTO_TB,
and the test in tb_gen_code for setting CF_COUNT_MASK to 1.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
accel/tcg/cpu-exec.c | 8 +++++++-
accel/tcg/translate-all.c | 2 +-
accel/tcg/translator.c | 2 +-
3 files changed, 9 insertions(+), 3 deletions(-)
diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index 70ea3c7d68..2206c463f5 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -149,7 +149,13 @@ uint32_t curr_cflags(CPUState *cpu)
{
uint32_t cflags = cpu->tcg_cflags;
- if (qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
+ /*
+ * For singlestep and -d nochain, suppress goto_tb so that
+ * we can log -d cpu,exec after every TB.
+ */
+ if (singlestep) {
+ cflags |= CF_NO_GOTO_TB | 1;
+ } else if (qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
cflags |= CF_NO_GOTO_TB;
}
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 997e44c73b..491c1a56b2 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -1432,7 +1432,7 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
}
QEMU_BUILD_BUG_ON(CF_COUNT_MASK + 1 != TCG_MAX_INSNS);
- if (cpu->singlestep_enabled || singlestep) {
+ if (cpu->singlestep_enabled) {
max_insns = 1;
}
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index 2ea5a74f30..a59eb7c11b 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -39,7 +39,7 @@ bool translator_use_goto_tb(DisasContextBase *db, target_ulong dest)
}
/* Suppress goto_tb in the case of single-steping. */
- if (db->singlestep_enabled || singlestep) {
+ if (db->singlestep_enabled) {
return false;
}
--
2.25.1
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PATCH v2 06/10] accel/tcg: Use CF_NO_GOTO_{TB, PTR} in cpu_exec_step_atomic
2021-07-12 15:39 [PATCH v2 00/10] tcg: breakpoint reorg Richard Henderson
` (4 preceding siblings ...)
2021-07-12 15:39 ` [PATCH v2 05/10] accel/tcg: Handle -singlestep in curr_cflags Richard Henderson
@ 2021-07-12 15:40 ` Richard Henderson
2021-07-17 17:43 ` Peter Maydell
2021-07-17 17:47 ` [PATCH v2 06/10] accel/tcg: Use CF_NO_GOTO_{TB,PTR} " Alex Bennée
2021-07-12 15:40 ` [PATCH v2 07/10] accel/tcg: Move cflags lookup into tb_find Richard Henderson
` (5 subsequent siblings)
11 siblings, 2 replies; 37+ messages in thread
From: Richard Henderson @ 2021-07-12 15:40 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, alex.bennee, f4bug
Request that the one TB returns immediately, so that
we release the exclusive lock as soon as possible.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
accel/tcg/cpu-exec.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index 2206c463f5..5bb099174f 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -330,8 +330,7 @@ void cpu_exec_step_atomic(CPUState *cpu)
CPUArchState *env = (CPUArchState *)cpu->env_ptr;
TranslationBlock *tb;
target_ulong cs_base, pc;
- uint32_t flags;
- uint32_t cflags = (curr_cflags(cpu) & ~CF_PARALLEL) | 1;
+ uint32_t flags, cflags;
int tb_exit;
if (sigsetjmp(cpu->jmp_env, 0) == 0) {
@@ -341,8 +340,14 @@ void cpu_exec_step_atomic(CPUState *cpu)
cpu->running = true;
cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
- tb = tb_lookup(cpu, pc, cs_base, flags, cflags);
+ cflags = curr_cflags(cpu);
+ /* Execute in a serial context. */
+ cflags &= ~CF_PARALLEL;
+ /* After 1 insn, return and release the exclusive lock. */
+ cflags |= CF_NO_GOTO_TB | CF_NO_GOTO_PTR | 1;
+
+ tb = tb_lookup(cpu, pc, cs_base, flags, cflags);
if (tb == NULL) {
mmap_lock();
tb = tb_gen_code(cpu, pc, cs_base, flags, cflags);
--
2.25.1
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PATCH v2 07/10] accel/tcg: Move cflags lookup into tb_find
2021-07-12 15:39 [PATCH v2 00/10] tcg: breakpoint reorg Richard Henderson
` (5 preceding siblings ...)
2021-07-12 15:40 ` [PATCH v2 06/10] accel/tcg: Use CF_NO_GOTO_{TB, PTR} in cpu_exec_step_atomic Richard Henderson
@ 2021-07-12 15:40 ` Richard Henderson
2021-07-17 17:45 ` Peter Maydell
2021-07-12 15:40 ` [PATCH v2 08/10] accel/tcg: Adjust interface of TranslatorOps.breakpoint_check Richard Henderson
` (4 subsequent siblings)
11 siblings, 1 reply; 37+ messages in thread
From: Richard Henderson @ 2021-07-12 15:40 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, alex.bennee, f4bug
We will shortly require the guest pc for computing cflags,
so move the choice just after cpu_get_tb_cpu_state.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
accel/tcg/cpu-exec.c | 34 +++++++++++++++++-----------------
1 file changed, 17 insertions(+), 17 deletions(-)
diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index 5bb099174f..4d043a11aa 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -502,15 +502,29 @@ static inline void tb_add_jump(TranslationBlock *tb, int n,
static inline TranslationBlock *tb_find(CPUState *cpu,
TranslationBlock *last_tb,
- int tb_exit, uint32_t cflags)
+ int tb_exit)
{
CPUArchState *env = (CPUArchState *)cpu->env_ptr;
TranslationBlock *tb;
target_ulong cs_base, pc;
- uint32_t flags;
+ uint32_t flags, cflags;
cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
+ /*
+ * When requested, use an exact setting for cflags for the next
+ * execution. This is used for icount, precise smc, and stop-
+ * after-access watchpoints. Since this request should never
+ * have CF_INVALID set, -1 is a convenient invalid value that
+ * does not require tcg headers for cpu_common_reset.
+ */
+ cflags = cpu->cflags_next_tb;
+ if (cflags == -1) {
+ cflags = curr_cflags(cpu);
+ } else {
+ cpu->cflags_next_tb = -1;
+ }
+
tb = tb_lookup(cpu, pc, cs_base, flags, cflags);
if (tb == NULL) {
mmap_lock();
@@ -868,21 +882,7 @@ int cpu_exec(CPUState *cpu)
int tb_exit = 0;
while (!cpu_handle_interrupt(cpu, &last_tb)) {
- uint32_t cflags = cpu->cflags_next_tb;
- TranslationBlock *tb;
-
- /* When requested, use an exact setting for cflags for the next
- execution. This is used for icount, precise smc, and stop-
- after-access watchpoints. Since this request should never
- have CF_INVALID set, -1 is a convenient invalid value that
- does not require tcg headers for cpu_common_reset. */
- if (cflags == -1) {
- cflags = curr_cflags(cpu);
- } else {
- cpu->cflags_next_tb = -1;
- }
-
- tb = tb_find(cpu, last_tb, tb_exit, cflags);
+ TranslationBlock *tb = tb_find(cpu, last_tb, tb_exit);
cpu_loop_exec_tb(cpu, tb, &last_tb, &tb_exit);
/* Try to align the host and virtual clocks
if the guest is in advance */
--
2.25.1
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PATCH v2 08/10] accel/tcg: Adjust interface of TranslatorOps.breakpoint_check
2021-07-12 15:39 [PATCH v2 00/10] tcg: breakpoint reorg Richard Henderson
` (6 preceding siblings ...)
2021-07-12 15:40 ` [PATCH v2 07/10] accel/tcg: Move cflags lookup into tb_find Richard Henderson
@ 2021-07-12 15:40 ` Richard Henderson
2021-07-17 17:52 ` Peter Maydell
2021-07-12 15:40 ` [PATCH v2 09/10] accel/tcg: Hoist tb_cflags to a local in translator_loop Richard Henderson
` (3 subsequent siblings)
11 siblings, 1 reply; 37+ messages in thread
From: Richard Henderson @ 2021-07-12 15:40 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, alex.bennee, f4bug
We don't need the whole CPUBreakpoint structure in the check,
only the flags. Return the instruction length to consolidate
the adjustment of db->pc_next.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
include/exec/translator.h | 17 +++++++++------
accel/tcg/translator.c | 40 ++++++++++++++++++++++++-----------
target/alpha/translate.c | 12 +++--------
target/arm/translate-a64.c | 14 ++++--------
target/arm/translate.c | 20 +++++++-----------
target/avr/translate.c | 6 +++---
target/cris/translate.c | 14 ++++--------
target/hexagon/translate.c | 13 +++---------
target/hppa/translate.c | 7 +++---
target/i386/tcg/translate.c | 15 ++++---------
target/m68k/translate.c | 14 +++---------
target/microblaze/translate.c | 14 +++---------
target/mips/tcg/translate.c | 14 ++++--------
target/nios2/translate.c | 13 +++---------
target/openrisc/translate.c | 11 +++-------
target/ppc/translate.c | 13 +++---------
target/riscv/translate.c | 11 +++-------
target/rx/translate.c | 8 +++----
target/s390x/translate.c | 12 ++++-------
target/sh4/translate.c | 12 ++++-------
target/sparc/translate.c | 9 ++++----
target/tricore/translate.c | 13 +++---------
target/xtensa/translate.c | 12 ++++-------
23 files changed, 115 insertions(+), 199 deletions(-)
diff --git a/include/exec/translator.h b/include/exec/translator.h
index dd9c06d40d..433b753c5c 100644
--- a/include/exec/translator.h
+++ b/include/exec/translator.h
@@ -92,11 +92,15 @@ typedef struct DisasContextBase {
* @breakpoint_check:
* When called, the breakpoint has already been checked to match the PC,
* but the target may decide the breakpoint missed the address
- * (e.g., due to conditions encoded in their flags). Return true to
- * indicate that the breakpoint did hit, in which case no more breakpoints
- * are checked. If the breakpoint did hit, emit any code required to
- * signal the exception, and set db->is_jmp as necessary to terminate
- * the main loop.
+ * (e.g., due to conditions encoded in their flags), in which case
+ * db->is_jmp may be left as DISAS_NEXT or DISAS_TOO_MANY to indicate
+ * that the insn should be translated. Anything other than those two
+ * will be taken to indicate an exception has been raised, but in most
+ * cases db->is_jmp should be set to DISAS_NORETURN.
+ *
+ * Return the minimum instruction size that should be applied to the TB.
+ * The size of any TB cannot be zero, as that breaks the math used to
+ * invalidate TBs.
*
* @translate_insn:
* Disassemble one instruction and set db->pc_next for the start
@@ -113,8 +117,7 @@ typedef struct TranslatorOps {
void (*init_disas_context)(DisasContextBase *db, CPUState *cpu);
void (*tb_start)(DisasContextBase *db, CPUState *cpu);
void (*insn_start)(DisasContextBase *db, CPUState *cpu);
- bool (*breakpoint_check)(DisasContextBase *db, CPUState *cpu,
- const CPUBreakpoint *bp);
+ int (*breakpoint_check)(DisasContextBase *db, CPUState *cpu, int flags);
void (*translate_insn)(DisasContextBase *db, CPUState *cpu);
void (*tb_stop)(DisasContextBase *db, CPUState *cpu);
void (*disas_log)(const DisasContextBase *db, CPUState *cpu);
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index a59eb7c11b..1c44d096d8 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -50,7 +50,6 @@ bool translator_use_goto_tb(DisasContextBase *db, target_ulong dest)
void translator_loop(const TranslatorOps *ops, DisasContextBase *db,
CPUState *cpu, TranslationBlock *tb, int max_insns)
{
- int bp_insn = 0;
bool plugin_enabled;
/* Initialize DisasContext */
@@ -91,19 +90,35 @@ void translator_loop(const TranslatorOps *ops, DisasContextBase *db,
CPUBreakpoint *bp;
QTAILQ_FOREACH(bp, &cpu->breakpoints, entry) {
if (bp->pc == db->pc_next) {
- if (ops->breakpoint_check(db, cpu, bp)) {
- bp_insn = 1;
- break;
+ int len = ops->breakpoint_check(db, cpu, bp->flags);
+
+ /*
+ * The breakpoint_check hook may use DISAS_TOO_MANY
+ * to indicate that only one more instruction is to
+ * be executed. Otherwise it should use DISAS_NORETURN
+ * when generating an exception, but may use a
+ * DISAS_TARGET_* value for Something Else.
+ */
+ if (db->is_jmp > DISAS_TOO_MANY) {
+ /*
+ * The address covered by the breakpoint must be
+ * included in [tb->pc, tb->pc + tb->size) in order
+ * to for it to be properly cleared. Thus we
+ * increment the PC here so that the logic setting
+ * tb->size below does the right thing.
+ */
+ tcg_debug_assert(len > 0);
+ db->pc_next += len;
+
+ /*
+ * The breakpoint definitely hit, so decrement the
+ * number of instructions completed for icount.
+ */
+ db->num_insns--;
+ goto done;
}
}
}
- /* The breakpoint_check hook may use DISAS_TOO_MANY to indicate
- that only one more instruction is to be executed. Otherwise
- it should use DISAS_NORETURN when generating an exception,
- but may use a DISAS_TARGET_* value for Something Else. */
- if (db->is_jmp > DISAS_TOO_MANY) {
- break;
- }
}
/* Disassemble one instruction. The translate_insn hook should
@@ -142,9 +157,10 @@ void translator_loop(const TranslatorOps *ops, DisasContextBase *db,
}
}
+ done:
/* Emit code to exit the TB, as indicated by db->is_jmp. */
ops->tb_stop(db, cpu);
- gen_tb_end(db->tb, db->num_insns - bp_insn);
+ gen_tb_end(db->tb, db->num_insns);
if (plugin_enabled) {
plugin_gen_tb_end(cpu);
diff --git a/target/alpha/translate.c b/target/alpha/translate.c
index 833d3baa7b..07d0197cfb 100644
--- a/target/alpha/translate.c
+++ b/target/alpha/translate.c
@@ -3012,19 +3012,13 @@ static void alpha_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
tcg_gen_insn_start(dcbase->pc_next);
}
-static bool alpha_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
- const CPUBreakpoint *bp)
+static int alpha_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
+ int bp_flags)
{
DisasContext *ctx = container_of(dcbase, DisasContext, base);
ctx->base.is_jmp = gen_excp(ctx, EXCP_DEBUG, 0);
-
- /* The address covered by the breakpoint must be included in
- [tb->pc, tb->pc + tb->size) in order to for it to be
- properly cleared -- thus we increment the PC here so that
- the logic setting tb->size below does the right thing. */
- ctx->base.pc_next += 4;
- return true;
+ return 4; /* minimum instruction length */
}
static void alpha_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index ca11a5fecd..4efd4e95d6 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -14844,28 +14844,22 @@ static void aarch64_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
dc->insn_start = tcg_last_op();
}
-static bool aarch64_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
- const CPUBreakpoint *bp)
+static int aarch64_tr_breakpoint_check(DisasContextBase *dcbase,
+ CPUState *cpu, int bp_flags)
{
DisasContext *dc = container_of(dcbase, DisasContext, base);
- if (bp->flags & BP_CPU) {
+ if (bp_flags & BP_CPU) {
gen_a64_set_pc_im(dc->base.pc_next);
gen_helper_check_breakpoints(cpu_env);
/* End the TB early; it likely won't be executed */
dc->base.is_jmp = DISAS_TOO_MANY;
} else {
gen_exception_internal_insn(dc, dc->base.pc_next, EXCP_DEBUG);
- /* The address covered by the breakpoint must be
- included in [tb->pc, tb->pc + tb->size) in order
- to for it to be properly cleared -- thus we
- increment the PC here so that the logic setting
- tb->size below does the right thing. */
- dc->base.pc_next += 4;
dc->base.is_jmp = DISAS_NORETURN;
}
- return true;
+ return 4; /* minimum instruction length */
}
static void aarch64_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
diff --git a/target/arm/translate.c b/target/arm/translate.c
index e1a8152598..ebac31c3ac 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -9438,12 +9438,12 @@ static void arm_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
dc->insn_start = tcg_last_op();
}
-static bool arm_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
- const CPUBreakpoint *bp)
+static int arm_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
+ int bp_flags)
{
DisasContext *dc = container_of(dcbase, DisasContext, base);
- if (bp->flags & BP_CPU) {
+ if (bp_flags & BP_CPU) {
gen_set_condexec(dc);
gen_set_pc_im(dc, dc->base.pc_next);
gen_helper_check_breakpoints(cpu_env);
@@ -9451,18 +9451,14 @@ static bool arm_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
dc->base.is_jmp = DISAS_TOO_MANY;
} else {
gen_exception_internal_insn(dc, dc->base.pc_next, EXCP_DEBUG);
- /* The address covered by the breakpoint must be
- included in [tb->pc, tb->pc + tb->size) in order
- to for it to be properly cleared -- thus we
- increment the PC here so that the logic setting
- tb->size below does the right thing. */
- /* TODO: Advance PC by correct instruction length to
- * avoid disassembler error messages */
- dc->base.pc_next += 2;
dc->base.is_jmp = DISAS_NORETURN;
}
- return true;
+ /*
+ * TODO: Advance PC by correct instruction length to avoid disassembler
+ * error messages. In the meantime, minimum instruction length.
+ */
+ return 2;
}
static bool arm_pre_translate_insn(DisasContext *dc)
diff --git a/target/avr/translate.c b/target/avr/translate.c
index 8237a03c23..73ff467926 100644
--- a/target/avr/translate.c
+++ b/target/avr/translate.c
@@ -2944,13 +2944,13 @@ static void avr_tr_insn_start(DisasContextBase *dcbase, CPUState *cs)
tcg_gen_insn_start(ctx->npc);
}
-static bool avr_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
- const CPUBreakpoint *bp)
+static int avr_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
+ int bp_flags)
{
DisasContext *ctx = container_of(dcbase, DisasContext, base);
gen_breakpoint(ctx);
- return true;
+ return 2; /* minimum instruction length */
}
static void avr_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
diff --git a/target/cris/translate.c b/target/cris/translate.c
index 9258c13e9f..b590e79433 100644
--- a/target/cris/translate.c
+++ b/target/cris/translate.c
@@ -3118,8 +3118,8 @@ static void cris_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
tcg_gen_insn_start(dc->delayed_branch == 1 ? dc->ppc | 1 : dc->pc);
}
-static bool cris_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
- const CPUBreakpoint *bp)
+static int cris_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
+ int bp_flags)
{
DisasContext *dc = container_of(dcbase, DisasContext, base);
@@ -3127,14 +3127,8 @@ static bool cris_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
tcg_gen_movi_tl(env_pc, dc->pc);
t_gen_raise_exception(EXCP_DEBUG);
dc->base.is_jmp = DISAS_NORETURN;
- /*
- * The address covered by the breakpoint must be included in
- * [tb->pc, tb->pc + tb->size) in order to for it to be
- * properly cleared -- thus we increment the PC here so that
- * the logic setting tb->size below does the right thing.
- */
- dc->pc += 2;
- return true;
+
+ return 2; /* minimum instruction length */
}
static void cris_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c
index b23d36adf5..75c0d40a13 100644
--- a/target/hexagon/translate.c
+++ b/target/hexagon/translate.c
@@ -540,20 +540,13 @@ static void hexagon_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
tcg_gen_insn_start(ctx->base.pc_next);
}
-static bool hexagon_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
- const CPUBreakpoint *bp)
+static int hexagon_tr_breakpoint_check(DisasContextBase *dcbase,
+ CPUState *cpu, int bp_flags)
{
DisasContext *ctx = container_of(dcbase, DisasContext, base);
gen_exception_end_tb(ctx, EXCP_DEBUG);
- /*
- * The address covered by the breakpoint must be included in
- * [tb->pc, tb->pc + tb->size) in order to for it to be
- * properly cleared -- thus we increment the PC here so that
- * the logic setting tb->size below does the right thing.
- */
- ctx->base.pc_next += 4;
- return true;
+ return 4; /* minimum packet length */
}
static bool pkt_crosses_page(CPUHexagonState *env, DisasContext *ctx)
diff --git a/target/hppa/translate.c b/target/hppa/translate.c
index 835120c038..407136f26b 100644
--- a/target/hppa/translate.c
+++ b/target/hppa/translate.c
@@ -4204,14 +4204,13 @@ static void hppa_tr_insn_start(DisasContextBase *dcbase, CPUState *cs)
tcg_gen_insn_start(ctx->iaoq_f, ctx->iaoq_b);
}
-static bool hppa_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
- const CPUBreakpoint *bp)
+static int hppa_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
+ int bp_flags)
{
DisasContext *ctx = container_of(dcbase, DisasContext, base);
gen_excp(ctx, EXCP_DEBUG);
- ctx->base.pc_next += 4;
- return true;
+ return 4; /* minimum instruction length */
}
static void hppa_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 3814ce2a3e..bf5a393b9b 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -8583,23 +8583,16 @@ static void i386_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
tcg_gen_insn_start(dc->base.pc_next, dc->cc_op);
}
-static bool i386_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
- const CPUBreakpoint *bp)
+static int i386_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
+ int bp_flags)
{
DisasContext *dc = container_of(dcbase, DisasContext, base);
/* If RF is set, suppress an internally generated breakpoint. */
int flags = dc->base.tb->flags & HF_RF_MASK ? BP_GDB : BP_ANY;
- if (bp->flags & flags) {
+ if (bp_flags & flags) {
gen_debug(dc);
- /* The address covered by the breakpoint must be included in
- [tb->pc, tb->pc + tb->size) in order to for it to be
- properly cleared -- thus we increment the PC here so that
- the generic logic setting tb->size later does the right thing. */
- dc->base.pc_next += 1;
- return true;
- } else {
- return false;
}
+ return 1; /* minimum instruction length */
}
static void i386_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
diff --git a/target/m68k/translate.c b/target/m68k/translate.c
index 1fee04b8dd..79c1847d54 100644
--- a/target/m68k/translate.c
+++ b/target/m68k/translate.c
@@ -6208,21 +6208,13 @@ static void m68k_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
tcg_gen_insn_start(dc->base.pc_next, dc->cc_op);
}
-static bool m68k_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
- const CPUBreakpoint *bp)
+static int m68k_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
+ int bp_flags)
{
DisasContext *dc = container_of(dcbase, DisasContext, base);
gen_exception(dc, dc->base.pc_next, EXCP_DEBUG);
- /*
- * The address covered by the breakpoint must be included in
- * [tb->pc, tb->pc + tb->size) in order to for it to be
- * properly cleared -- thus we increment the PC here so that
- * the logic setting tb->size below does the right thing.
- */
- dc->base.pc_next += 2;
-
- return true;
+ return 2; /* minimum instruction length */
}
static void m68k_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
diff --git a/target/microblaze/translate.c b/target/microblaze/translate.c
index c68a84a219..09a364cceb 100644
--- a/target/microblaze/translate.c
+++ b/target/microblaze/translate.c
@@ -1673,21 +1673,13 @@ static void mb_tr_insn_start(DisasContextBase *dcb, CPUState *cs)
dc->insn_start = tcg_last_op();
}
-static bool mb_tr_breakpoint_check(DisasContextBase *dcb, CPUState *cs,
- const CPUBreakpoint *bp)
+static int mb_tr_breakpoint_check(DisasContextBase *dcb, CPUState *cs,
+ int bp_flags)
{
DisasContext *dc = container_of(dcb, DisasContext, base);
gen_raise_exception_sync(dc, EXCP_DEBUG);
-
- /*
- * The address covered by the breakpoint must be included in
- * [tb->pc, tb->pc + tb->size) in order to for it to be
- * properly cleared -- thus we increment the PC here so that
- * the logic setting tb->size below does the right thing.
- */
- dc->base.pc_next += 4;
- return true;
+ return 4; /* minimum instruction length */
}
static void mb_tr_translate_insn(DisasContextBase *dcb, CPUState *cs)
diff --git a/target/mips/tcg/translate.c b/target/mips/tcg/translate.c
index 47c967acbf..c7b9d813c2 100644
--- a/target/mips/tcg/translate.c
+++ b/target/mips/tcg/translate.c
@@ -16190,22 +16190,16 @@ static void mips_tr_insn_start(DisasContextBase *dcbase, CPUState *cs)
ctx->btarget);
}
-static bool mips_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
- const CPUBreakpoint *bp)
+static int mips_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
+ int bp_flags)
{
DisasContext *ctx = container_of(dcbase, DisasContext, base);
save_cpu_state(ctx, 1);
ctx->base.is_jmp = DISAS_NORETURN;
gen_helper_raise_exception_debug(cpu_env);
- /*
- * The address covered by the breakpoint must be included in
- * [tb->pc, tb->pc + tb->size) in order to for it to be
- * properly cleared -- thus we increment the PC here so that
- * the logic setting tb->size below does the right thing.
- */
- ctx->base.pc_next += 4;
- return true;
+
+ return 2; /* minimum instruction length */
}
static void mips_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
diff --git a/target/nios2/translate.c b/target/nios2/translate.c
index 17742cebc7..1d1c66b88f 100644
--- a/target/nios2/translate.c
+++ b/target/nios2/translate.c
@@ -777,20 +777,13 @@ static void nios2_tr_insn_start(DisasContextBase *dcbase, CPUState *cs)
tcg_gen_insn_start(dcbase->pc_next);
}
-static bool nios2_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
- const CPUBreakpoint *bp)
+static int nios2_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
+ int bp_flags)
{
DisasContext *dc = container_of(dcbase, DisasContext, base);
gen_exception(dc, EXCP_DEBUG);
- /*
- * The address covered by the breakpoint must be included in
- * [tb->pc, tb->pc + tb->size) in order to for it to be
- * properly cleared -- thus we increment the PC here so that
- * the logic setting tb->size below does the right thing.
- */
- dc->base.pc_next += 4;
- return true;
+ return 4; /* minimum instruction length */
}
static void nios2_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
diff --git a/target/openrisc/translate.c b/target/openrisc/translate.c
index 37c3e3e0a3..adcf71645b 100644
--- a/target/openrisc/translate.c
+++ b/target/openrisc/translate.c
@@ -1639,20 +1639,15 @@ static void openrisc_tr_insn_start(DisasContextBase *dcbase, CPUState *cs)
| (dc->base.num_insns > 1 ? 2 : 0));
}
-static bool openrisc_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
- const CPUBreakpoint *bp)
+static int openrisc_tr_breakpoint_check(DisasContextBase *dcbase,
+ CPUState *cs, int bp_flags)
{
DisasContext *dc = container_of(dcbase, DisasContext, base);
tcg_gen_movi_tl(cpu_pc, dc->base.pc_next);
gen_exception(dc, EXCP_DEBUG);
dc->base.is_jmp = DISAS_NORETURN;
- /* The address covered by the breakpoint must be included in
- [tb->pc, tb->pc + tb->size) in order to for it to be
- properly cleared -- thus we increment the PC here so that
- the logic setting tb->size below does the right thing. */
- dc->base.pc_next += 4;
- return true;
+ return 4; /* minimum instruction length */
}
static void openrisc_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index 0a55cb7181..5093be0694 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -8565,21 +8565,14 @@ static void ppc_tr_insn_start(DisasContextBase *dcbase, CPUState *cs)
tcg_gen_insn_start(dcbase->pc_next);
}
-static bool ppc_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
- const CPUBreakpoint *bp)
+static int ppc_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
+ int bp_flags)
{
DisasContext *ctx = container_of(dcbase, DisasContext, base);
gen_update_nip(ctx, ctx->base.pc_next);
gen_debug_exception(ctx);
- /*
- * The address covered by the breakpoint must be included in
- * [tb->pc, tb->pc + tb->size) in order to for it to be properly
- * cleared -- thus we increment the PC here so that the logic
- * setting tb->size below does the right thing.
- */
- ctx->base.pc_next += 4;
- return true;
+ return 4; /* minimum instruction length */
}
static bool is_prefix_insn(DisasContext *ctx, uint32_t insn)
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index deda0c8a44..8a6bc58572 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -961,20 +961,15 @@ static void riscv_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
tcg_gen_insn_start(ctx->base.pc_next);
}
-static bool riscv_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
- const CPUBreakpoint *bp)
+static int riscv_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
+ int bp_flags)
{
DisasContext *ctx = container_of(dcbase, DisasContext, base);
tcg_gen_movi_tl(cpu_pc, ctx->base.pc_next);
ctx->base.is_jmp = DISAS_NORETURN;
gen_exception_debug();
- /* The address covered by the breakpoint must be included in
- [tb->pc, tb->pc + tb->size) in order to for it to be
- properly cleared -- thus we increment the PC here so that
- the logic setting tb->size below does the right thing. */
- ctx->base.pc_next += 4;
- return true;
+ return 2; /* minimum instruction length */
}
static void riscv_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
diff --git a/target/rx/translate.c b/target/rx/translate.c
index 23a626438a..5e9950f3ac 100644
--- a/target/rx/translate.c
+++ b/target/rx/translate.c
@@ -2309,8 +2309,8 @@ static void rx_tr_insn_start(DisasContextBase *dcbase, CPUState *cs)
tcg_gen_insn_start(ctx->base.pc_next);
}
-static bool rx_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
- const CPUBreakpoint *bp)
+static int rx_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
+ int bp_flags)
{
DisasContext *ctx = container_of(dcbase, DisasContext, base);
@@ -2318,8 +2318,8 @@ static bool rx_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
tcg_gen_movi_i32(cpu_pc, ctx->base.pc_next);
gen_helper_debug(cpu_env);
ctx->base.is_jmp = DISAS_NORETURN;
- ctx->base.pc_next += 1;
- return true;
+
+ return 1; /* minimum instruction length */
}
static void rx_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
diff --git a/target/s390x/translate.c b/target/s390x/translate.c
index c8d55d1f83..06f8fa96a6 100644
--- a/target/s390x/translate.c
+++ b/target/s390x/translate.c
@@ -6567,8 +6567,8 @@ static void s390x_tr_insn_start(DisasContextBase *dcbase, CPUState *cs)
{
}
-static bool s390x_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
- const CPUBreakpoint *bp)
+static int s390x_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
+ int bp_flags)
{
DisasContext *dc = container_of(dcbase, DisasContext, base);
@@ -6582,12 +6582,8 @@ static bool s390x_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
dc->base.is_jmp = DISAS_PC_STALE;
dc->do_debug = true;
- /* The address covered by the breakpoint must be included in
- [tb->pc, tb->pc + tb->size) in order to for it to be
- properly cleared -- thus we increment the PC here so that
- the logic setting tb->size does the right thing. */
- dc->base.pc_next += 2;
- return true;
+
+ return 2; /* minimum instruction length */
}
static void s390x_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
diff --git a/target/sh4/translate.c b/target/sh4/translate.c
index 40898e2393..b1e19bf976 100644
--- a/target/sh4/translate.c
+++ b/target/sh4/translate.c
@@ -2289,8 +2289,8 @@ static void sh4_tr_insn_start(DisasContextBase *dcbase, CPUState *cs)
tcg_gen_insn_start(ctx->base.pc_next, ctx->envflags);
}
-static bool sh4_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
- const CPUBreakpoint *bp)
+static int sh4_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
+ int bp_flags)
{
DisasContext *ctx = container_of(dcbase, DisasContext, base);
@@ -2298,12 +2298,8 @@ static bool sh4_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
gen_save_cpu_state(ctx, true);
gen_helper_debug(cpu_env);
ctx->base.is_jmp = DISAS_NORETURN;
- /* The address covered by the breakpoint must be included in
- [tb->pc, tb->pc + tb->size) in order to for it to be
- properly cleared -- thus we increment the PC here so that
- the logic setting tb->size below does the right thing. */
- ctx->base.pc_next += 2;
- return true;
+
+ return 2; /* minimum instruction length */
}
static void sh4_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
diff --git a/target/sparc/translate.c b/target/sparc/translate.c
index e530cb4aa8..d6b554cefe 100644
--- a/target/sparc/translate.c
+++ b/target/sparc/translate.c
@@ -5854,8 +5854,8 @@ static void sparc_tr_insn_start(DisasContextBase *dcbase, CPUState *cs)
}
}
-static bool sparc_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
- const CPUBreakpoint *bp)
+static int sparc_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
+ int bp_flags)
{
DisasContext *dc = container_of(dcbase, DisasContext, base);
@@ -5865,9 +5865,8 @@ static bool sparc_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
gen_helper_debug(cpu_env);
tcg_gen_exit_tb(NULL, 0);
dc->base.is_jmp = DISAS_NORETURN;
- /* update pc_next so that the current instruction is included in tb->size */
- dc->base.pc_next += 4;
- return true;
+
+ return 4; /* minimum instruction length */
}
static void sparc_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
diff --git a/target/tricore/translate.c b/target/tricore/translate.c
index 865020754d..8c39134d52 100644
--- a/target/tricore/translate.c
+++ b/target/tricore/translate.c
@@ -8810,19 +8810,12 @@ static void tricore_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
tcg_gen_insn_start(ctx->base.pc_next);
}
-static bool tricore_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
- const CPUBreakpoint *bp)
+static int tricore_tr_breakpoint_check(DisasContextBase *dcbase,
+ CPUState *cpu, int bp_flags)
{
DisasContext *ctx = container_of(dcbase, DisasContext, base);
generate_qemu_excp(ctx, EXCP_DEBUG);
- /*
- * The address covered by the breakpoint must be included in
- * [tb->pc, tb->pc + tb->size) in order to for it to be
- * properly cleared -- thus we increment the PC here so that
- * the logic setting tb->size below does the right thing.
- */
- ctx->base.pc_next += 4;
- return true;
+ return 4; /* minimum instruction length */
}
static bool insn_crosses_page(CPUTriCoreState *env, DisasContext *ctx)
diff --git a/target/xtensa/translate.c b/target/xtensa/translate.c
index 7094cfcf1d..6e7ad266f4 100644
--- a/target/xtensa/translate.c
+++ b/target/xtensa/translate.c
@@ -1232,20 +1232,16 @@ static void xtensa_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
tcg_gen_insn_start(dcbase->pc_next);
}
-static bool xtensa_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
- const CPUBreakpoint *bp)
+static int xtensa_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
+ int bp_flags)
{
DisasContext *dc = container_of(dcbase, DisasContext, base);
tcg_gen_movi_i32(cpu_pc, dc->base.pc_next);
gen_exception(dc, EXCP_DEBUG);
dc->base.is_jmp = DISAS_NORETURN;
- /* The address covered by the breakpoint must be included in
- [tb->pc, tb->pc + tb->size) in order to for it to be
- properly cleared -- thus we increment the PC here so that
- the logic setting tb->size below does the right thing. */
- dc->base.pc_next += 2;
- return true;
+
+ return 2; /* minimum instruction length */
}
static void xtensa_tr_translate_insn(DisasContextBase *dcbase, CPUState *cpu)
--
2.25.1
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PATCH v2 09/10] accel/tcg: Hoist tb_cflags to a local in translator_loop
2021-07-12 15:39 [PATCH v2 00/10] tcg: breakpoint reorg Richard Henderson
` (7 preceding siblings ...)
2021-07-12 15:40 ` [PATCH v2 08/10] accel/tcg: Adjust interface of TranslatorOps.breakpoint_check Richard Henderson
@ 2021-07-12 15:40 ` Richard Henderson
2021-07-17 17:50 ` Alex Bennée
2021-07-12 15:40 ` [PATCH v2 10/10] accel/tcg: Encode breakpoint info into tb->cflags Richard Henderson
` (2 subsequent siblings)
11 siblings, 1 reply; 37+ messages in thread
From: Richard Henderson @ 2021-07-12 15:40 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, alex.bennee, f4bug
The access internal to tb_cflags() is atomic.
Avoid re-reading it as such for the multiple uses.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
accel/tcg/translator.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index 1c44d096d8..449159a27c 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -50,6 +50,7 @@ bool translator_use_goto_tb(DisasContextBase *db, target_ulong dest)
void translator_loop(const TranslatorOps *ops, DisasContextBase *db,
CPUState *cpu, TranslationBlock *tb, int max_insns)
{
+ uint32_t cflags = tb_cflags(tb);
bool plugin_enabled;
/* Initialize DisasContext */
@@ -72,8 +73,7 @@ void translator_loop(const TranslatorOps *ops, DisasContextBase *db,
ops->tb_start(db, cpu);
tcg_debug_assert(db->is_jmp == DISAS_NEXT); /* no early exit */
- plugin_enabled = plugin_gen_tb_start(cpu, tb,
- tb_cflags(db->tb) & CF_MEMI_ONLY);
+ plugin_enabled = plugin_gen_tb_start(cpu, tb, cflags & CF_MEMI_ONLY);
while (true) {
db->num_insns++;
@@ -125,14 +125,13 @@ void translator_loop(const TranslatorOps *ops, DisasContextBase *db,
update db->pc_next and db->is_jmp to indicate what should be
done next -- either exiting this loop or locate the start of
the next instruction. */
- if (db->num_insns == db->max_insns
- && (tb_cflags(db->tb) & CF_LAST_IO)) {
+ if (db->num_insns == db->max_insns && (cflags & CF_LAST_IO)) {
/* Accept I/O on the last instruction. */
gen_io_start();
ops->translate_insn(db, cpu);
} else {
/* we should only see CF_MEMI_ONLY for io_recompile */
- tcg_debug_assert(!(tb_cflags(db->tb) & CF_MEMI_ONLY));
+ tcg_debug_assert(!(cflags & CF_MEMI_ONLY));
ops->translate_insn(db, cpu);
}
--
2.25.1
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PATCH v2 10/10] accel/tcg: Encode breakpoint info into tb->cflags
2021-07-12 15:39 [PATCH v2 00/10] tcg: breakpoint reorg Richard Henderson
` (8 preceding siblings ...)
2021-07-12 15:40 ` [PATCH v2 09/10] accel/tcg: Hoist tb_cflags to a local in translator_loop Richard Henderson
@ 2021-07-12 15:40 ` Richard Henderson
2021-07-17 17:58 ` Peter Maydell
2021-07-12 18:20 ` [PATCH v2 00/10] tcg: breakpoint reorg Mark Cave-Ayland
2021-07-17 17:16 ` Richard Henderson
11 siblings, 1 reply; 37+ messages in thread
From: Richard Henderson @ 2021-07-12 15:40 UTC (permalink / raw)
To: qemu-devel; +Cc: mark.cave-ayland, alex.bennee, f4bug
Having this data in cflags means that hashing takes care
of selecting a TB with or without exceptions built in.
Which means that we no longer need to flush all TBs.
This does require that we single-step while we're within a page
that contains a breakpoint, so it's not yet ideal, but should be
an improvement over some corner-case slowdowns.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/404
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
include/exec/exec-all.h | 7 ++++
accel/tcg/cpu-exec.c | 68 ++++++++++++++++++++++++++++++-
accel/tcg/translate-all.c | 4 --
accel/tcg/translator.c | 85 +++++++++++++++++++++------------------
cpu.c | 24 -----------
5 files changed, 119 insertions(+), 69 deletions(-)
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 6873cce8df..7ab2578f71 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -502,9 +502,16 @@ struct TranslationBlock {
#define CF_USE_ICOUNT 0x00020000
#define CF_INVALID 0x00040000 /* TB is stale. Set with @jmp_lock held */
#define CF_PARALLEL 0x00080000 /* Generate code for a parallel context */
+#define CF_BP_MASK 0x00300000 /* See below */
+#define CF_BP_SHIFT 20
#define CF_CLUSTER_MASK 0xff000000 /* Top 8 bits are cluster ID */
#define CF_CLUSTER_SHIFT 24
+#define CF_BP_NONE (0 << CF_BP_SHIFT) /* TB does not interact with BPs */
+#define CF_BP_SSTEP (1 << CF_BP_SHIFT) /* gdbstub single-step in effect */
+#define CF_BP_GDB (2 << CF_BP_SHIFT) /* gdbstub breakpoint at tb->pc */
+#define CF_BP_CPU (3 << CF_BP_SHIFT) /* arch breakpoint at tb->pc */
+
/* Per-vCPU dynamic tracing state used to generate this TB */
uint32_t trace_vcpu_dstate;
diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index 4d043a11aa..179a425ece 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -222,6 +222,65 @@ static inline void log_cpu_exec(target_ulong pc, CPUState *cpu,
}
}
+static uint32_t cflags_for_breakpoints(CPUState *cpu, target_ulong pc,
+ uint32_t cflags)
+{
+ uint32_t bflags = 0;
+
+ if (unlikely(cpu->singlestep_enabled)) {
+ bflags = CF_BP_SSTEP;
+ } else {
+ bool match_page = false;
+ CPUBreakpoint *bp;
+
+ QTAILQ_FOREACH(bp, &cpu->breakpoints, entry) {
+ /* Note exact pc matches */
+ if (pc == bp->pc) {
+ if (bp->flags & BP_GDB) {
+ bflags = CF_BP_GDB;
+ break;
+ }
+ if (bp->flags & BP_CPU) {
+ bflags = CF_BP_CPU;
+ break;
+ }
+ }
+ /* Note page matches */
+ if (((pc ^ bp->pc) & TARGET_PAGE_MASK) == 0) {
+ match_page = true;
+ }
+ }
+
+ if (likely(!bflags)) {
+ if (likely(!match_page)) {
+ return cflags;
+ }
+
+ /*
+ * Within the same page as a breakpoint, single-step,
+ * returning to helper_lookup_tb_ptr after each looking
+ * for the actual breakpoint.
+ *
+ * TODO: Perhaps better to record all of the TBs associated
+ * with a given virtual page that contains a breakpoint, and
+ * then invalidate them when a new overlapping breakpoint is
+ * set on the page. Non-overlapping TBs would not be
+ * invalidated, nor would any TB need to be invalidated as
+ * breakpoints are removed.
+ */
+ bflags = CF_NO_GOTO_TB;
+ }
+ }
+
+ /*
+ * Reduce the TB to one insn.
+ * In the case of a BP hit, we will be raising an exception anyway.
+ * In the case of a page hit, return to helper_lookup_tb_ptr after
+ * every insn to look for the breakpoint.
+ */
+ return (cflags & ~CF_COUNT_MASK) | 1 | bflags;
+}
+
/**
* helper_lookup_tb_ptr: quick check for next tb
* @env: current cpu state
@@ -235,11 +294,13 @@ const void *HELPER(lookup_tb_ptr)(CPUArchState *env)
CPUState *cpu = env_cpu(env);
TranslationBlock *tb;
target_ulong cs_base, pc;
- uint32_t flags;
+ uint32_t flags, cflags;
cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
- tb = tb_lookup(cpu, pc, cs_base, flags, curr_cflags(cpu));
+ cflags = cflags_for_breakpoints(cpu, pc, curr_cflags(cpu));
+
+ tb = tb_lookup(cpu, pc, cs_base, flags, cflags);
if (tb == NULL) {
return tcg_code_gen_epilogue;
}
@@ -346,6 +407,8 @@ void cpu_exec_step_atomic(CPUState *cpu)
cflags &= ~CF_PARALLEL;
/* After 1 insn, return and release the exclusive lock. */
cflags |= CF_NO_GOTO_TB | CF_NO_GOTO_PTR | 1;
+ /* Raise any post-instruction debug exceptions. */
+ cflags = cflags_for_breakpoints(cpu, pc, cflags);
tb = tb_lookup(cpu, pc, cs_base, flags, cflags);
if (tb == NULL) {
@@ -524,6 +587,7 @@ static inline TranslationBlock *tb_find(CPUState *cpu,
} else {
cpu->cflags_next_tb = -1;
}
+ cflags = cflags_for_breakpoints(cpu, pc, cflags);
tb = tb_lookup(cpu, pc, cs_base, flags, cflags);
if (tb == NULL) {
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 491c1a56b2..1f98078608 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -1432,10 +1432,6 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
}
QEMU_BUILD_BUG_ON(CF_COUNT_MASK + 1 != TCG_MAX_INSNS);
- if (cpu->singlestep_enabled) {
- max_insns = 1;
- }
-
buffer_overflow:
tb = tcg_tb_alloc(tcg_ctx);
if (unlikely(!tb)) {
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index 449159a27c..01b48137da 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -33,13 +33,8 @@ void translator_loop_temp_check(DisasContextBase *db)
bool translator_use_goto_tb(DisasContextBase *db, target_ulong dest)
{
- /* Suppress goto_tb if requested. */
- if (tb_cflags(db->tb) & CF_NO_GOTO_TB) {
- return false;
- }
-
- /* Suppress goto_tb in the case of single-steping. */
- if (db->singlestep_enabled) {
+ /* Suppress goto_tb if requested, or required by breakpoints. */
+ if (tb_cflags(db->tb) & (CF_NO_GOTO_TB | CF_BP_MASK)) {
return false;
}
@@ -51,6 +46,7 @@ void translator_loop(const TranslatorOps *ops, DisasContextBase *db,
CPUState *cpu, TranslationBlock *tb, int max_insns)
{
uint32_t cflags = tb_cflags(tb);
+ int bp_flags = 0;
bool plugin_enabled;
/* Initialize DisasContext */
@@ -60,7 +56,23 @@ void translator_loop(const TranslatorOps *ops, DisasContextBase *db,
db->is_jmp = DISAS_NEXT;
db->num_insns = 0;
db->max_insns = max_insns;
- db->singlestep_enabled = cpu->singlestep_enabled;
+ db->singlestep_enabled = false;
+
+ switch (cflags & CF_BP_MASK) {
+ case CF_BP_NONE:
+ break;
+ case CF_BP_SSTEP:
+ db->singlestep_enabled = true;
+ break;
+ case CF_BP_GDB:
+ bp_flags = BP_GDB;
+ break;
+ case CF_BP_CPU:
+ bp_flags = BP_CPU;
+ break;
+ default:
+ g_assert_not_reached();
+ }
ops->init_disas_context(db, cpu);
tcg_debug_assert(db->is_jmp == DISAS_NEXT); /* no early exit */
@@ -85,39 +97,34 @@ void translator_loop(const TranslatorOps *ops, DisasContextBase *db,
}
/* Pass breakpoint hits to target for further processing */
- if (!db->singlestep_enabled
- && unlikely(!QTAILQ_EMPTY(&cpu->breakpoints))) {
- CPUBreakpoint *bp;
- QTAILQ_FOREACH(bp, &cpu->breakpoints, entry) {
- if (bp->pc == db->pc_next) {
- int len = ops->breakpoint_check(db, cpu, bp->flags);
+ if (unlikely(bp_flags)) {
+ int len = ops->breakpoint_check(db, cpu, bp_flags);
- /*
- * The breakpoint_check hook may use DISAS_TOO_MANY
- * to indicate that only one more instruction is to
- * be executed. Otherwise it should use DISAS_NORETURN
- * when generating an exception, but may use a
- * DISAS_TARGET_* value for Something Else.
- */
- if (db->is_jmp > DISAS_TOO_MANY) {
- /*
- * The address covered by the breakpoint must be
- * included in [tb->pc, tb->pc + tb->size) in order
- * to for it to be properly cleared. Thus we
- * increment the PC here so that the logic setting
- * tb->size below does the right thing.
- */
- tcg_debug_assert(len > 0);
- db->pc_next += len;
+ /*
+ * When there is a bp hit, we're going to execute a maximum
+ * of one instruction. The breakpoint_check hook may use
+ * DISAS_NEXT or DISAS_TOO_MANY to indicate that the current
+ * instruction should be translated. Anything else is taken
+ * to mean that an exception has been raised and that zero
+ * instructions will be executed.
+ */
+ if (db->is_jmp > DISAS_TOO_MANY) {
+ /*
+ * The address covered by the breakpoint must be
+ * included in [tb->pc, tb->pc + tb->size) in order
+ * to for it to be properly cleared. Thus we
+ * increment the PC here so that the logic setting
+ * tb->size below does the right thing.
+ */
+ tcg_debug_assert(len > 0);
+ db->pc_next += len;
- /*
- * The breakpoint definitely hit, so decrement the
- * number of instructions completed for icount.
- */
- db->num_insns--;
- goto done;
- }
- }
+ /*
+ * The breakpoint definitely hit, so decrement the
+ * number of instructions completed for icount.
+ */
+ db->num_insns--;
+ goto done;
}
}
diff --git a/cpu.c b/cpu.c
index 83059537d7..addcb5db9c 100644
--- a/cpu.c
+++ b/cpu.c
@@ -225,11 +225,6 @@ void tb_invalidate_phys_addr(target_ulong addr)
tb_invalidate_phys_page_range(addr, addr + 1);
mmap_unlock();
}
-
-static void breakpoint_invalidate(CPUState *cpu, target_ulong pc)
-{
- tb_invalidate_phys_addr(pc);
-}
#else
void tb_invalidate_phys_addr(AddressSpace *as, hwaddr addr, MemTxAttrs attrs)
{
@@ -250,17 +245,6 @@ void tb_invalidate_phys_addr(AddressSpace *as, hwaddr addr, MemTxAttrs attrs)
ram_addr = memory_region_get_ram_addr(mr) + addr;
tb_invalidate_phys_page_range(ram_addr, ram_addr + 1);
}
-
-static void breakpoint_invalidate(CPUState *cpu, target_ulong pc)
-{
- /*
- * There may not be a virtual to physical translation for the pc
- * right now, but there may exist cached TB for this pc.
- * Flush the whole TB cache to force re-translation of such TBs.
- * This is heavyweight, but we're debugging anyway.
- */
- tb_flush(cpu);
-}
#endif
/* Add a breakpoint. */
@@ -281,8 +265,6 @@ int cpu_breakpoint_insert(CPUState *cpu, vaddr pc, int flags,
QTAILQ_INSERT_TAIL(&cpu->breakpoints, bp, entry);
}
- breakpoint_invalidate(cpu, pc);
-
if (breakpoint) {
*breakpoint = bp;
}
@@ -310,8 +292,6 @@ void cpu_breakpoint_remove_by_ref(CPUState *cpu, CPUBreakpoint *bp)
{
QTAILQ_REMOVE(&cpu->breakpoints, bp, entry);
- breakpoint_invalidate(cpu, bp->pc);
-
trace_breakpoint_remove(cpu->cpu_index, bp->pc, bp->flags);
g_free(bp);
}
@@ -336,10 +316,6 @@ void cpu_single_step(CPUState *cpu, int enabled)
cpu->singlestep_enabled = enabled;
if (kvm_enabled()) {
kvm_update_guest_debug(cpu, 0);
- } else {
- /* must flush all the translated code to avoid inconsistencies */
- /* XXX: only flush what is necessary */
- tb_flush(cpu);
}
trace_breakpoint_singlestep(cpu->cpu_index, enabled);
}
--
2.25.1
^ permalink raw reply related [flat|nested] 37+ messages in thread
* Re: [PATCH v2 00/10] tcg: breakpoint reorg
2021-07-12 15:39 [PATCH v2 00/10] tcg: breakpoint reorg Richard Henderson
` (9 preceding siblings ...)
2021-07-12 15:40 ` [PATCH v2 10/10] accel/tcg: Encode breakpoint info into tb->cflags Richard Henderson
@ 2021-07-12 18:20 ` Mark Cave-Ayland
2021-07-17 17:16 ` Richard Henderson
11 siblings, 0 replies; 37+ messages in thread
From: Mark Cave-Ayland @ 2021-07-12 18:20 UTC (permalink / raw)
To: Richard Henderson, qemu-devel; +Cc: alex.bennee, f4bug
On 12/07/2021 16:39, Richard Henderson wrote:
> This is fixing #404 ("windows xp boot takes much longer...")
> and several other similar reports.
>
> For v2, all prerequisites and 7 of the patches from v1 with
> reviews are now upstream.
>
> Mark Cave-Ayland reported success with WinXP with v1, with
> this patch set being even faster than b55f54bc~1. Which was
> a bit of a surprise, but I'll take it. It means that it's
> probably not worth making the breakpoint detection scheme
> any more complicated.
>
> I'd still like some more feedback. Given this is fixing a
> regression from qemu 5.2 I feel comfortable delaying this
> past soft freeze, but not past hard freeze on the 20th.
>
>
> r~
>
>
> Richard Henderson (10):
> accel/tcg: Reduce CF_COUNT_MASK to match TCG_MAX_INSNS
> accel/tcg: Move curr_cflags into cpu-exec.c
> accel/tcg: Add CF_NO_GOTO_TB and CF_NO_GOTO_PTR
> accel/tcg: Drop CF_NO_GOTO_PTR from -d nochain
> accel/tcg: Handle -singlestep in curr_cflags
> accel/tcg: Use CF_NO_GOTO_{TB,PTR} in cpu_exec_step_atomic
> accel/tcg: Move cflags lookup into tb_find
> accel/tcg: Adjust interface of TranslatorOps.breakpoint_check
> accel/tcg: Hoist tb_cflags to a local in translator_loop
> accel/tcg: Encode breakpoint info into tb->cflags
>
> include/exec/exec-all.h | 30 +++++---
> include/exec/translator.h | 17 +++--
> accel/tcg/cpu-exec.c | 130 ++++++++++++++++++++++++++++------
> accel/tcg/translate-all.c | 7 +-
> accel/tcg/translator.c | 79 ++++++++++++++-------
> cpu.c | 24 -------
> target/alpha/translate.c | 12 +---
> target/arm/translate-a64.c | 14 ++--
> target/arm/translate.c | 20 +++---
> target/avr/translate.c | 6 +-
> target/cris/translate.c | 14 ++--
> target/hexagon/translate.c | 13 +---
> target/hppa/translate.c | 7 +-
> target/i386/tcg/translate.c | 15 ++--
> target/m68k/translate.c | 14 +---
> target/microblaze/translate.c | 14 +---
> target/mips/tcg/translate.c | 14 ++--
> target/nios2/translate.c | 13 +---
> target/openrisc/translate.c | 11 +--
> target/ppc/translate.c | 13 +---
> target/riscv/translate.c | 11 +--
> target/rx/translate.c | 8 +--
> target/s390x/translate.c | 12 ++--
> target/sh4/translate.c | 12 ++--
> target/sparc/translate.c | 9 ++-
> target/tricore/translate.c | 13 +---
> target/xtensa/translate.c | 12 ++--
> tcg/tcg-op.c | 28 ++++----
> 28 files changed, 280 insertions(+), 292 deletions(-)
FWIW I've just tested this v2 patchset on top of git master (bd38ae26ce) and I still
see the same improvement i.e. WinXP boot to the login screen goes down from 2m 38s to
25s so:
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
ATB,
Mark.
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v2 00/10] tcg: breakpoint reorg
2021-07-12 15:39 [PATCH v2 00/10] tcg: breakpoint reorg Richard Henderson
` (10 preceding siblings ...)
2021-07-12 18:20 ` [PATCH v2 00/10] tcg: breakpoint reorg Mark Cave-Ayland
@ 2021-07-17 17:16 ` Richard Henderson
11 siblings, 0 replies; 37+ messages in thread
From: Richard Henderson @ 2021-07-17 17:16 UTC (permalink / raw)
To: qemu-devel, Peter Maydell; +Cc: mark.cave-ayland, alex.bennee, f4bug
Ping. A Tested-by is nice, but Reviewed-by is better,
and time is running out, even for bug fixes.
r~
On 7/12/21 8:39 AM, Richard Henderson wrote:
> This is fixing #404 ("windows xp boot takes much longer...")
> and several other similar reports.
>
> For v2, all prerequisites and 7 of the patches from v1 with
> reviews are now upstream.
>
> Mark Cave-Ayland reported success with WinXP with v1, with
> this patch set being even faster than b55f54bc~1. Which was
> a bit of a surprise, but I'll take it. It means that it's
> probably not worth making the breakpoint detection scheme
> any more complicated.
>
> I'd still like some more feedback. Given this is fixing a
> regression from qemu 5.2 I feel comfortable delaying this
> past soft freeze, but not past hard freeze on the 20th.
>
>
> r~
>
>
> Richard Henderson (10):
> accel/tcg: Reduce CF_COUNT_MASK to match TCG_MAX_INSNS
> accel/tcg: Move curr_cflags into cpu-exec.c
> accel/tcg: Add CF_NO_GOTO_TB and CF_NO_GOTO_PTR
> accel/tcg: Drop CF_NO_GOTO_PTR from -d nochain
> accel/tcg: Handle -singlestep in curr_cflags
> accel/tcg: Use CF_NO_GOTO_{TB,PTR} in cpu_exec_step_atomic
> accel/tcg: Move cflags lookup into tb_find
> accel/tcg: Adjust interface of TranslatorOps.breakpoint_check
> accel/tcg: Hoist tb_cflags to a local in translator_loop
> accel/tcg: Encode breakpoint info into tb->cflags
>
> include/exec/exec-all.h | 30 +++++---
> include/exec/translator.h | 17 +++--
> accel/tcg/cpu-exec.c | 130 ++++++++++++++++++++++++++++------
> accel/tcg/translate-all.c | 7 +-
> accel/tcg/translator.c | 79 ++++++++++++++-------
> cpu.c | 24 -------
> target/alpha/translate.c | 12 +---
> target/arm/translate-a64.c | 14 ++--
> target/arm/translate.c | 20 +++---
> target/avr/translate.c | 6 +-
> target/cris/translate.c | 14 ++--
> target/hexagon/translate.c | 13 +---
> target/hppa/translate.c | 7 +-
> target/i386/tcg/translate.c | 15 ++--
> target/m68k/translate.c | 14 +---
> target/microblaze/translate.c | 14 +---
> target/mips/tcg/translate.c | 14 ++--
> target/nios2/translate.c | 13 +---
> target/openrisc/translate.c | 11 +--
> target/ppc/translate.c | 13 +---
> target/riscv/translate.c | 11 +--
> target/rx/translate.c | 8 +--
> target/s390x/translate.c | 12 ++--
> target/sh4/translate.c | 12 ++--
> target/sparc/translate.c | 9 ++-
> target/tricore/translate.c | 13 +---
> target/xtensa/translate.c | 12 ++--
> tcg/tcg-op.c | 28 ++++----
> 28 files changed, 280 insertions(+), 292 deletions(-)
>
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v2 01/10] accel/tcg: Reduce CF_COUNT_MASK to match TCG_MAX_INSNS
2021-07-12 15:39 ` [PATCH v2 01/10] accel/tcg: Reduce CF_COUNT_MASK to match TCG_MAX_INSNS Richard Henderson
@ 2021-07-17 17:30 ` Peter Maydell
2021-07-17 18:31 ` Richard Henderson
2021-07-17 17:33 ` Alex Bennée
1 sibling, 1 reply; 37+ messages in thread
From: Peter Maydell @ 2021-07-17 17:30 UTC (permalink / raw)
To: Richard Henderson
Cc: Alex Bennée, Mark Cave-Ayland, QEMU Developers,
Philippe Mathieu-Daudé
On Mon, 12 Jul 2021 at 16:42, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> The space reserved for CF_COUNT_MASK was overly large.
> Reduce to free up cflags bits and eliminate an extra test.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> include/exec/exec-all.h | 4 +++-
> accel/tcg/translate-all.c | 5 ++---
> 2 files changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
> index 754f4130c9..dfe82ed19c 100644
> --- a/include/exec/exec-all.h
> +++ b/include/exec/exec-all.h
> @@ -492,7 +492,9 @@ struct TranslationBlock {
> target_ulong cs_base; /* CS base for this block */
> uint32_t flags; /* flags defining in which context the code was generated */
> uint32_t cflags; /* compile flags */
> -#define CF_COUNT_MASK 0x00007fff
> +
> +/* Note that TCG_MAX_INSNS is 512; we validate this match elsewhere. */
> +#define CF_COUNT_MASK 0x000001ff
> #define CF_LAST_IO 0x00008000 /* Last insn may be an IO access. */
> #define CF_MEMI_ONLY 0x00010000 /* Only instrument memory ops */
> #define CF_USE_ICOUNT 0x00020000
> diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
> index 4df26de858..997e44c73b 100644
> --- a/accel/tcg/translate-all.c
> +++ b/accel/tcg/translate-all.c
> @@ -1430,9 +1430,8 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
> if (max_insns == 0) {
> max_insns = CF_COUNT_MASK;
> }
> - if (max_insns > TCG_MAX_INSNS) {
> - max_insns = TCG_MAX_INSNS;
> - }
> + QEMU_BUILD_BUG_ON(CF_COUNT_MASK + 1 != TCG_MAX_INSNS);
> +
Previously we would allow max_insns = TCG_MAX_INSNS (512).
Now we only allow it to be 511...
-- PMM
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v2 02/10] accel/tcg: Move curr_cflags into cpu-exec.c
2021-07-12 15:39 ` [PATCH v2 02/10] accel/tcg: Move curr_cflags into cpu-exec.c Richard Henderson
@ 2021-07-17 17:32 ` Peter Maydell
2021-07-17 17:34 ` Alex Bennée
1 sibling, 0 replies; 37+ messages in thread
From: Peter Maydell @ 2021-07-17 17:32 UTC (permalink / raw)
To: Richard Henderson
Cc: Alex Bennée, Mark Cave-Ayland, QEMU Developers,
Philippe Mathieu-Daudé
On Mon, 12 Jul 2021 at 16:44, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> We will shortly have more than a simple member read here,
> with stuff not necessarily exposed to exec/exec-all.h.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
(apologies if you saw a no-content email: gmail burp)
thanks
-- PMM
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v2 01/10] accel/tcg: Reduce CF_COUNT_MASK to match TCG_MAX_INSNS
2021-07-12 15:39 ` [PATCH v2 01/10] accel/tcg: Reduce CF_COUNT_MASK to match TCG_MAX_INSNS Richard Henderson
2021-07-17 17:30 ` Peter Maydell
@ 2021-07-17 17:33 ` Alex Bennée
1 sibling, 0 replies; 37+ messages in thread
From: Alex Bennée @ 2021-07-17 17:33 UTC (permalink / raw)
To: Richard Henderson; +Cc: mark.cave-ayland, qemu-devel, f4bug
Richard Henderson <richard.henderson@linaro.org> writes:
> The space reserved for CF_COUNT_MASK was overly large.
> Reduce to free up cflags bits and eliminate an extra test.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
--
Alex Bennée
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v2 02/10] accel/tcg: Move curr_cflags into cpu-exec.c
2021-07-12 15:39 ` [PATCH v2 02/10] accel/tcg: Move curr_cflags into cpu-exec.c Richard Henderson
2021-07-17 17:32 ` Peter Maydell
@ 2021-07-17 17:34 ` Alex Bennée
1 sibling, 0 replies; 37+ messages in thread
From: Alex Bennée @ 2021-07-17 17:34 UTC (permalink / raw)
To: Richard Henderson; +Cc: mark.cave-ayland, qemu-devel, f4bug
Richard Henderson <richard.henderson@linaro.org> writes:
> We will shortly have more than a simple member read here,
> with stuff not necessarily exposed to exec/exec-all.h.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
--
Alex Bennée
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v2 03/10] accel/tcg: Add CF_NO_GOTO_TB and CF_NO_GOTO_PTR
2021-07-12 15:39 ` [PATCH v2 03/10] accel/tcg: Add CF_NO_GOTO_TB and CF_NO_GOTO_PTR Richard Henderson
@ 2021-07-17 17:36 ` Alex Bennée
2021-07-17 17:36 ` Peter Maydell
1 sibling, 0 replies; 37+ messages in thread
From: Alex Bennée @ 2021-07-17 17:36 UTC (permalink / raw)
To: Richard Henderson; +Cc: mark.cave-ayland, qemu-devel, f4bug
Richard Henderson <richard.henderson@linaro.org> writes:
> Move the -d nochain check to bits on tb->cflags.
> These will be used for more than -d nochain shortly.
>
> Set bits during curr_cflags, test them in translator_use_goto_tb,
> assert we're not doing anything odd in tcg_gen_goto_tb. The test
> in tcg_gen_exit_tb is redundant with the assert for goto_tb_issue_mask.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
--
Alex Bennée
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v2 03/10] accel/tcg: Add CF_NO_GOTO_TB and CF_NO_GOTO_PTR
2021-07-12 15:39 ` [PATCH v2 03/10] accel/tcg: Add CF_NO_GOTO_TB and CF_NO_GOTO_PTR Richard Henderson
2021-07-17 17:36 ` Alex Bennée
@ 2021-07-17 17:36 ` Peter Maydell
1 sibling, 0 replies; 37+ messages in thread
From: Peter Maydell @ 2021-07-17 17:36 UTC (permalink / raw)
To: Richard Henderson
Cc: Alex Bennée, Mark Cave-Ayland, QEMU Developers,
Philippe Mathieu-Daudé
On Mon, 12 Jul 2021 at 16:44, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Move the -d nochain check to bits on tb->cflags.
> These will be used for more than -d nochain shortly.
>
> Set bits during curr_cflags, test them in translator_use_goto_tb,
> assert we're not doing anything odd in tcg_gen_goto_tb. The test
> in tcg_gen_exit_tb is redundant with the assert for goto_tb_issue_mask.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
thanks
-- PMM
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v2 04/10] accel/tcg: Drop CF_NO_GOTO_PTR from -d nochain
2021-07-12 15:39 ` [PATCH v2 04/10] accel/tcg: Drop CF_NO_GOTO_PTR from -d nochain Richard Henderson
@ 2021-07-17 17:39 ` Peter Maydell
2021-07-17 17:39 ` Alex Bennée
1 sibling, 0 replies; 37+ messages in thread
From: Peter Maydell @ 2021-07-17 17:39 UTC (permalink / raw)
To: Richard Henderson
Cc: Alex Bennée, Mark Cave-Ayland, QEMU Developers,
Philippe Mathieu-Daudé
On Mon, 12 Jul 2021 at 16:44, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> The purpose of suppressing goto_ptr from -d nochain had been
> to return to the main loop so that -d cpu would be recognized.
> But we now include -d cpu logging in helper_lookup_tb_ptr so
> there is no need to exclude goto_ptr.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> accel/tcg/cpu-exec.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
> index d3232d5764..70ea3c7d68 100644
> --- a/accel/tcg/cpu-exec.c
> +++ b/accel/tcg/cpu-exec.c
> @@ -150,7 +150,7 @@ uint32_t curr_cflags(CPUState *cpu)
> uint32_t cflags = cpu->tcg_cflags;
>
> if (qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
> - cflags |= CF_NO_GOTO_TB | CF_NO_GOTO_PTR;
> + cflags |= CF_NO_GOTO_TB;
> }
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
thanks
-- PMM
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v2 04/10] accel/tcg: Drop CF_NO_GOTO_PTR from -d nochain
2021-07-12 15:39 ` [PATCH v2 04/10] accel/tcg: Drop CF_NO_GOTO_PTR from -d nochain Richard Henderson
2021-07-17 17:39 ` Peter Maydell
@ 2021-07-17 17:39 ` Alex Bennée
2021-07-17 18:38 ` Richard Henderson
1 sibling, 1 reply; 37+ messages in thread
From: Alex Bennée @ 2021-07-17 17:39 UTC (permalink / raw)
To: Richard Henderson; +Cc: mark.cave-ayland, qemu-devel, f4bug
Richard Henderson <richard.henderson@linaro.org> writes:
> The purpose of suppressing goto_ptr from -d nochain had been
> to return to the main loop so that -d cpu would be recognized.
Hmm is it though? I've always treated it as ensuring we always come out
into the main loop (which is helpful for debugging).
> But we now include -d cpu logging in helper_lookup_tb_ptr so
> there is no need to exclude goto_ptr.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> accel/tcg/cpu-exec.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
> index d3232d5764..70ea3c7d68 100644
> --- a/accel/tcg/cpu-exec.c
> +++ b/accel/tcg/cpu-exec.c
> @@ -150,7 +150,7 @@ uint32_t curr_cflags(CPUState *cpu)
> uint32_t cflags = cpu->tcg_cflags;
>
> if (qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
> - cflags |= CF_NO_GOTO_TB | CF_NO_GOTO_PTR;
> + cflags |= CF_NO_GOTO_TB;
> }
>
> return cflags;
--
Alex Bennée
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v2 05/10] accel/tcg: Handle -singlestep in curr_cflags
2021-07-12 15:39 ` [PATCH v2 05/10] accel/tcg: Handle -singlestep in curr_cflags Richard Henderson
@ 2021-07-17 17:41 ` Peter Maydell
2021-07-17 17:42 ` Alex Bennée
1 sibling, 0 replies; 37+ messages in thread
From: Peter Maydell @ 2021-07-17 17:41 UTC (permalink / raw)
To: Richard Henderson
Cc: Alex Bennée, Mark Cave-Ayland, QEMU Developers,
Philippe Mathieu-Daudé
On Mon, 12 Jul 2021 at 16:42, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Exchange the test in translator_use_goto_tb for CF_NO_GOTO_TB,
> and the test in tb_gen_code for setting CF_COUNT_MASK to 1.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> accel/tcg/cpu-exec.c | 8 +++++++-
> accel/tcg/translate-all.c | 2 +-
> accel/tcg/translator.c | 2 +-
> 3 files changed, 9 insertions(+), 3 deletions(-)
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
thanks
-- PMM
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v2 05/10] accel/tcg: Handle -singlestep in curr_cflags
2021-07-12 15:39 ` [PATCH v2 05/10] accel/tcg: Handle -singlestep in curr_cflags Richard Henderson
2021-07-17 17:41 ` Peter Maydell
@ 2021-07-17 17:42 ` Alex Bennée
2021-07-17 18:42 ` Richard Henderson
1 sibling, 1 reply; 37+ messages in thread
From: Alex Bennée @ 2021-07-17 17:42 UTC (permalink / raw)
To: Richard Henderson; +Cc: mark.cave-ayland, qemu-devel, f4bug
Richard Henderson <richard.henderson@linaro.org> writes:
> Exchange the test in translator_use_goto_tb for CF_NO_GOTO_TB,
> and the test in tb_gen_code for setting CF_COUNT_MASK to 1.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> accel/tcg/cpu-exec.c | 8 +++++++-
> accel/tcg/translate-all.c | 2 +-
> accel/tcg/translator.c | 2 +-
> 3 files changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
> index 70ea3c7d68..2206c463f5 100644
> --- a/accel/tcg/cpu-exec.c
> +++ b/accel/tcg/cpu-exec.c
> @@ -149,7 +149,13 @@ uint32_t curr_cflags(CPUState *cpu)
> {
> uint32_t cflags = cpu->tcg_cflags;
>
> - if (qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
> + /*
> + * For singlestep and -d nochain, suppress goto_tb so that
> + * we can log -d cpu,exec after every TB.
> + */
> + if (singlestep) {
Hmm we are testing a magic global here and looking at
cpu->singlestep_enabled lower down. We have a transient singlestep which
is turned on an off via cpu->singlestep_enabled and a global as a debug
option. Can we rationalise it further?
> + cflags |= CF_NO_GOTO_TB | 1;
> + } else if (qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
> cflags |= CF_NO_GOTO_TB;
> }
>
> diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
> index 997e44c73b..491c1a56b2 100644
> --- a/accel/tcg/translate-all.c
> +++ b/accel/tcg/translate-all.c
> @@ -1432,7 +1432,7 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
> }
> QEMU_BUILD_BUG_ON(CF_COUNT_MASK + 1 != TCG_MAX_INSNS);
>
> - if (cpu->singlestep_enabled || singlestep) {
> + if (cpu->singlestep_enabled) {
> max_insns = 1;
> }
>
> diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
> index 2ea5a74f30..a59eb7c11b 100644
> --- a/accel/tcg/translator.c
> +++ b/accel/tcg/translator.c
> @@ -39,7 +39,7 @@ bool translator_use_goto_tb(DisasContextBase *db, target_ulong dest)
> }
>
> /* Suppress goto_tb in the case of single-steping. */
> - if (db->singlestep_enabled || singlestep) {
> + if (db->singlestep_enabled) {
> return false;
> }
--
Alex Bennée
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v2 06/10] accel/tcg: Use CF_NO_GOTO_{TB, PTR} in cpu_exec_step_atomic
2021-07-12 15:40 ` [PATCH v2 06/10] accel/tcg: Use CF_NO_GOTO_{TB, PTR} in cpu_exec_step_atomic Richard Henderson
@ 2021-07-17 17:43 ` Peter Maydell
2021-07-17 19:03 ` Richard Henderson
2021-07-17 17:47 ` [PATCH v2 06/10] accel/tcg: Use CF_NO_GOTO_{TB,PTR} " Alex Bennée
1 sibling, 1 reply; 37+ messages in thread
From: Peter Maydell @ 2021-07-17 17:43 UTC (permalink / raw)
To: Richard Henderson
Cc: Alex Bennée, Mark Cave-Ayland, QEMU Developers,
Philippe Mathieu-Daudé
On Mon, 12 Jul 2021 at 16:46, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Request that the one TB returns immediately, so that
> we release the exclusive lock as soon as possible.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> accel/tcg/cpu-exec.c | 11 ++++++++---
> 1 file changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
> index 2206c463f5..5bb099174f 100644
> --- a/accel/tcg/cpu-exec.c
> +++ b/accel/tcg/cpu-exec.c
> @@ -330,8 +330,7 @@ void cpu_exec_step_atomic(CPUState *cpu)
> CPUArchState *env = (CPUArchState *)cpu->env_ptr;
> TranslationBlock *tb;
> target_ulong cs_base, pc;
> - uint32_t flags;
> - uint32_t cflags = (curr_cflags(cpu) & ~CF_PARALLEL) | 1;
> + uint32_t flags, cflags;
> int tb_exit;
>
> if (sigsetjmp(cpu->jmp_env, 0) == 0) {
> @@ -341,8 +340,14 @@ void cpu_exec_step_atomic(CPUState *cpu)
> cpu->running = true;
>
> cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
> - tb = tb_lookup(cpu, pc, cs_base, flags, cflags);
>
> + cflags = curr_cflags(cpu);
> + /* Execute in a serial context. */
> + cflags &= ~CF_PARALLEL;
> + /* After 1 insn, return and release the exclusive lock. */
> + cflags |= CF_NO_GOTO_TB | CF_NO_GOTO_PTR | 1;
> +
> + tb = tb_lookup(cpu, pc, cs_base, flags, cflags);
> if (tb == NULL) {
> mmap_lock();
> tb = tb_gen_code(cpu, pc, cs_base, flags, cflags);
So previously we would have executed possibly a chain of TBs
before releasing the lock, and now we definitely execute just one?
(I guess the execute-a-chain case is unlikely given the TB
only has one insn and we know it's an exclusive insn...)
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
thanks
-- PMM
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v2 07/10] accel/tcg: Move cflags lookup into tb_find
2021-07-12 15:40 ` [PATCH v2 07/10] accel/tcg: Move cflags lookup into tb_find Richard Henderson
@ 2021-07-17 17:45 ` Peter Maydell
0 siblings, 0 replies; 37+ messages in thread
From: Peter Maydell @ 2021-07-17 17:45 UTC (permalink / raw)
To: Richard Henderson
Cc: Alex Bennée, Mark Cave-Ayland, QEMU Developers,
Philippe Mathieu-Daudé
On Mon, 12 Jul 2021 at 16:42, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> We will shortly require the guest pc for computing cflags,
> so move the choice just after cpu_get_tb_cpu_state.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> accel/tcg/cpu-exec.c | 34 +++++++++++++++++-----------------
> 1 file changed, 17 insertions(+), 17 deletions(-)
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
thanks
-- PMM
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v2 06/10] accel/tcg: Use CF_NO_GOTO_{TB,PTR} in cpu_exec_step_atomic
2021-07-12 15:40 ` [PATCH v2 06/10] accel/tcg: Use CF_NO_GOTO_{TB, PTR} in cpu_exec_step_atomic Richard Henderson
2021-07-17 17:43 ` Peter Maydell
@ 2021-07-17 17:47 ` Alex Bennée
1 sibling, 0 replies; 37+ messages in thread
From: Alex Bennée @ 2021-07-17 17:47 UTC (permalink / raw)
To: Richard Henderson; +Cc: mark.cave-ayland, qemu-devel, f4bug
Richard Henderson <richard.henderson@linaro.org> writes:
> Request that the one TB returns immediately, so that
> we release the exclusive lock as soon as possible.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
--
Alex Bennée
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v2 09/10] accel/tcg: Hoist tb_cflags to a local in translator_loop
2021-07-12 15:40 ` [PATCH v2 09/10] accel/tcg: Hoist tb_cflags to a local in translator_loop Richard Henderson
@ 2021-07-17 17:50 ` Alex Bennée
0 siblings, 0 replies; 37+ messages in thread
From: Alex Bennée @ 2021-07-17 17:50 UTC (permalink / raw)
To: Richard Henderson; +Cc: mark.cave-ayland, qemu-devel, f4bug
Richard Henderson <richard.henderson@linaro.org> writes:
> The access internal to tb_cflags() is atomic.
> Avoid re-reading it as such for the multiple uses.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
--
Alex Bennée
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v2 08/10] accel/tcg: Adjust interface of TranslatorOps.breakpoint_check
2021-07-12 15:40 ` [PATCH v2 08/10] accel/tcg: Adjust interface of TranslatorOps.breakpoint_check Richard Henderson
@ 2021-07-17 17:52 ` Peter Maydell
2021-07-17 19:41 ` Richard Henderson
0 siblings, 1 reply; 37+ messages in thread
From: Peter Maydell @ 2021-07-17 17:52 UTC (permalink / raw)
To: Richard Henderson
Cc: Alex Bennée, Mark Cave-Ayland, QEMU Developers,
Philippe Mathieu-Daudé
On Mon, 12 Jul 2021 at 16:48, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> We don't need the whole CPUBreakpoint structure in the check,
> only the flags. Return the instruction length to consolidate
> the adjustment of db->pc_next.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> diff --git a/target/avr/translate.c b/target/avr/translate.c
> index 8237a03c23..73ff467926 100644
> --- a/target/avr/translate.c
> +++ b/target/avr/translate.c
> @@ -2944,13 +2944,13 @@ static void avr_tr_insn_start(DisasContextBase *dcbase, CPUState *cs)
> tcg_gen_insn_start(ctx->npc);
> }
>
> -static bool avr_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
> - const CPUBreakpoint *bp)
> +static int avr_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
> + int bp_flags)
> {
> DisasContext *ctx = container_of(dcbase, DisasContext, base);
>
> gen_breakpoint(ctx);
> - return true;
> + return 2; /* minimum instruction length */
Here we weren't advancing pc_next at all, and now we do.
> diff --git a/target/mips/tcg/translate.c b/target/mips/tcg/translate.c
> index 47c967acbf..c7b9d813c2 100644
> --- a/target/mips/tcg/translate.c
> +++ b/target/mips/tcg/translate.c
> @@ -16190,22 +16190,16 @@ static void mips_tr_insn_start(DisasContextBase *dcbase, CPUState *cs)
> ctx->btarget);
> }
>
> -static bool mips_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
> - const CPUBreakpoint *bp)
> +static int mips_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
> + int bp_flags)
> {
> DisasContext *ctx = container_of(dcbase, DisasContext, base);
>
> save_cpu_state(ctx, 1);
> ctx->base.is_jmp = DISAS_NORETURN;
> gen_helper_raise_exception_debug(cpu_env);
> - /*
> - * The address covered by the breakpoint must be included in
> - * [tb->pc, tb->pc + tb->size) in order to for it to be
> - * properly cleared -- thus we increment the PC here so that
> - * the logic setting tb->size below does the right thing.
> - */
> - ctx->base.pc_next += 4;
> - return true;
> +
> + return 2; /* minimum instruction length */
> }
Here we were advancing by 4 and now advance by 2.
> diff --git a/target/riscv/translate.c b/target/riscv/translate.c
> index deda0c8a44..8a6bc58572 100644
> --- a/target/riscv/translate.c
> +++ b/target/riscv/translate.c
> @@ -961,20 +961,15 @@ static void riscv_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
> tcg_gen_insn_start(ctx->base.pc_next);
> }
>
> -static bool riscv_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
> - const CPUBreakpoint *bp)
> +static int riscv_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
> + int bp_flags)
> {
> DisasContext *ctx = container_of(dcbase, DisasContext, base);
>
> tcg_gen_movi_tl(cpu_pc, ctx->base.pc_next);
> ctx->base.is_jmp = DISAS_NORETURN;
> gen_exception_debug();
> - /* The address covered by the breakpoint must be included in
> - [tb->pc, tb->pc + tb->size) in order to for it to be
> - properly cleared -- thus we increment the PC here so that
> - the logic setting tb->size below does the right thing. */
> - ctx->base.pc_next += 4;
> - return true;
> + return 2; /* minimum instruction length */
> }
Ditto.
If these are intentional changes (are they bugfixes?) they should be in a
separate patch.
Otherwise
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
thanks
-- PMM
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v2 10/10] accel/tcg: Encode breakpoint info into tb->cflags
2021-07-12 15:40 ` [PATCH v2 10/10] accel/tcg: Encode breakpoint info into tb->cflags Richard Henderson
@ 2021-07-17 17:58 ` Peter Maydell
2021-07-17 19:20 ` Richard Henderson
0 siblings, 1 reply; 37+ messages in thread
From: Peter Maydell @ 2021-07-17 17:58 UTC (permalink / raw)
To: Richard Henderson
Cc: Alex Bennée, Mark Cave-Ayland, QEMU Developers,
Philippe Mathieu-Daudé
On Mon, 12 Jul 2021 at 16:49, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Having this data in cflags means that hashing takes care
> of selecting a TB with or without exceptions built in.
> Which means that we no longer need to flush all TBs.
>
> This does require that we single-step while we're within a page
> that contains a breakpoint, so it's not yet ideal, but should be
> an improvement over some corner-case slowdowns.
>
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/404
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> include/exec/exec-all.h | 7 ++++
> accel/tcg/cpu-exec.c | 68 ++++++++++++++++++++++++++++++-
> accel/tcg/translate-all.c | 4 --
> accel/tcg/translator.c | 85 +++++++++++++++++++++------------------
> cpu.c | 24 -----------
> 5 files changed, 119 insertions(+), 69 deletions(-)
>
> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
> index 6873cce8df..7ab2578f71 100644
> --- a/include/exec/exec-all.h
> +++ b/include/exec/exec-all.h
> @@ -502,9 +502,16 @@ struct TranslationBlock {
> #define CF_USE_ICOUNT 0x00020000
> #define CF_INVALID 0x00040000 /* TB is stale. Set with @jmp_lock held */
> #define CF_PARALLEL 0x00080000 /* Generate code for a parallel context */
> +#define CF_BP_MASK 0x00300000 /* See below */
> +#define CF_BP_SHIFT 20
> #define CF_CLUSTER_MASK 0xff000000 /* Top 8 bits are cluster ID */
> #define CF_CLUSTER_SHIFT 24
>
> +#define CF_BP_NONE (0 << CF_BP_SHIFT) /* TB does not interact with BPs */
> +#define CF_BP_SSTEP (1 << CF_BP_SHIFT) /* gdbstub single-step in effect */
> +#define CF_BP_GDB (2 << CF_BP_SHIFT) /* gdbstub breakpoint at tb->pc */
> +#define CF_BP_CPU (3 << CF_BP_SHIFT) /* arch breakpoint at tb->pc */
> +
> /* Per-vCPU dynamic tracing state used to generate this TB */
> uint32_t trace_vcpu_dstate;
>
> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
> index 4d043a11aa..179a425ece 100644
> --- a/accel/tcg/cpu-exec.c
> +++ b/accel/tcg/cpu-exec.c
> @@ -222,6 +222,65 @@ static inline void log_cpu_exec(target_ulong pc, CPUState *cpu,
> }
> }
>
> +static uint32_t cflags_for_breakpoints(CPUState *cpu, target_ulong pc,
> + uint32_t cflags)
> +{
> + uint32_t bflags = 0;
> +
> + if (unlikely(cpu->singlestep_enabled)) {
> + bflags = CF_BP_SSTEP;
> + } else {
Won't this ignore breakpoints when singlestepping ?
-- PMM
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v2 01/10] accel/tcg: Reduce CF_COUNT_MASK to match TCG_MAX_INSNS
2021-07-17 17:30 ` Peter Maydell
@ 2021-07-17 18:31 ` Richard Henderson
0 siblings, 0 replies; 37+ messages in thread
From: Richard Henderson @ 2021-07-17 18:31 UTC (permalink / raw)
To: Peter Maydell
Cc: Alex Bennée, Mark Cave-Ayland, QEMU Developers,
Philippe Mathieu-Daudé
On 7/17/21 10:30 AM, Peter Maydell wrote:
> On Mon, 12 Jul 2021 at 16:42, Richard Henderson
> <richard.henderson@linaro.org> wrote:
>>
>> The space reserved for CF_COUNT_MASK was overly large.
>> Reduce to free up cflags bits and eliminate an extra test.
>>
>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>> ---
>> include/exec/exec-all.h | 4 +++-
>> accel/tcg/translate-all.c | 5 ++---
>> 2 files changed, 5 insertions(+), 4 deletions(-)
>>
>> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
>> index 754f4130c9..dfe82ed19c 100644
>> --- a/include/exec/exec-all.h
>> +++ b/include/exec/exec-all.h
>> @@ -492,7 +492,9 @@ struct TranslationBlock {
>> target_ulong cs_base; /* CS base for this block */
>> uint32_t flags; /* flags defining in which context the code was generated */
>> uint32_t cflags; /* compile flags */
>> -#define CF_COUNT_MASK 0x00007fff
>> +
>> +/* Note that TCG_MAX_INSNS is 512; we validate this match elsewhere. */
>> +#define CF_COUNT_MASK 0x000001ff
>> #define CF_LAST_IO 0x00008000 /* Last insn may be an IO access. */
>> #define CF_MEMI_ONLY 0x00010000 /* Only instrument memory ops */
>> #define CF_USE_ICOUNT 0x00020000
>> diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
>> index 4df26de858..997e44c73b 100644
>> --- a/accel/tcg/translate-all.c
>> +++ b/accel/tcg/translate-all.c
>> @@ -1430,9 +1430,8 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
>> if (max_insns == 0) {
>> max_insns = CF_COUNT_MASK;
>> }
>> - if (max_insns > TCG_MAX_INSNS) {
>> - max_insns = TCG_MAX_INSNS;
>> - }
>> + QEMU_BUILD_BUG_ON(CF_COUNT_MASK + 1 != TCG_MAX_INSNS);
>> +
>
> Previously we would allow max_insns = TCG_MAX_INSNS (512).
> Now we only allow it to be 511...
Hmm. Well, 0 is supposed to map to "max", currently written as
max_insns = cflags & CF_COUNT_MASK;
if (max_insns == 0) {
max_insns = CF_COUNT_MASK;
}
I could just as easily map 0 -> TCG_MAX_INSNS instead.
r~
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v2 04/10] accel/tcg: Drop CF_NO_GOTO_PTR from -d nochain
2021-07-17 17:39 ` Alex Bennée
@ 2021-07-17 18:38 ` Richard Henderson
2021-07-17 19:48 ` Alex Bennée
0 siblings, 1 reply; 37+ messages in thread
From: Richard Henderson @ 2021-07-17 18:38 UTC (permalink / raw)
To: Alex Bennée; +Cc: mark.cave-ayland, qemu-devel, f4bug
On 7/17/21 10:39 AM, Alex Bennée wrote:
>
> Richard Henderson <richard.henderson@linaro.org> writes:
>
>> The purpose of suppressing goto_ptr from -d nochain had been
>> to return to the main loop so that -d cpu would be recognized.
>
> Hmm is it though? I've always treated it as ensuring we always come out
> into the main loop (which is helpful for debugging).
What's helpful for debugging wrt the main loop beyond logging?
r~
>
>> But we now include -d cpu logging in helper_lookup_tb_ptr so
>> there is no need to exclude goto_ptr.
>>
>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>> ---
>> accel/tcg/cpu-exec.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
>> index d3232d5764..70ea3c7d68 100644
>> --- a/accel/tcg/cpu-exec.c
>> +++ b/accel/tcg/cpu-exec.c
>> @@ -150,7 +150,7 @@ uint32_t curr_cflags(CPUState *cpu)
>> uint32_t cflags = cpu->tcg_cflags;
>>
>> if (qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
>> - cflags |= CF_NO_GOTO_TB | CF_NO_GOTO_PTR;
>> + cflags |= CF_NO_GOTO_TB;
>> }
>>
>> return cflags;
>
>
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v2 05/10] accel/tcg: Handle -singlestep in curr_cflags
2021-07-17 17:42 ` Alex Bennée
@ 2021-07-17 18:42 ` Richard Henderson
2021-07-17 19:51 ` Peter Maydell
0 siblings, 1 reply; 37+ messages in thread
From: Richard Henderson @ 2021-07-17 18:42 UTC (permalink / raw)
To: Alex Bennée; +Cc: mark.cave-ayland, qemu-devel, f4bug
On 7/17/21 10:42 AM, Alex Bennée wrote:
>
> Richard Henderson <richard.henderson@linaro.org> writes:
>
>> Exchange the test in translator_use_goto_tb for CF_NO_GOTO_TB,
>> and the test in tb_gen_code for setting CF_COUNT_MASK to 1.
>>
>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>> ---
>> accel/tcg/cpu-exec.c | 8 +++++++-
>> accel/tcg/translate-all.c | 2 +-
>> accel/tcg/translator.c | 2 +-
>> 3 files changed, 9 insertions(+), 3 deletions(-)
>>
>> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
>> index 70ea3c7d68..2206c463f5 100644
>> --- a/accel/tcg/cpu-exec.c
>> +++ b/accel/tcg/cpu-exec.c
>> @@ -149,7 +149,13 @@ uint32_t curr_cflags(CPUState *cpu)
>> {
>> uint32_t cflags = cpu->tcg_cflags;
>>
>> - if (qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
>> + /*
>> + * For singlestep and -d nochain, suppress goto_tb so that
>> + * we can log -d cpu,exec after every TB.
>> + */
>> + if (singlestep) {
>
> Hmm we are testing a magic global here and looking at
> cpu->singlestep_enabled lower down. We have a transient singlestep which
> is turned on an off via cpu->singlestep_enabled and a global as a debug
> option. Can we rationalise it further?
Not sure what you're asking.
cpu->singlestep_enabled raises a debug exception after one insn, whereas singlestep merely
exits the tb after one insn.
What is it that you want me to rationalize?
r~
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v2 06/10] accel/tcg: Use CF_NO_GOTO_{TB, PTR} in cpu_exec_step_atomic
2021-07-17 17:43 ` Peter Maydell
@ 2021-07-17 19:03 ` Richard Henderson
0 siblings, 0 replies; 37+ messages in thread
From: Richard Henderson @ 2021-07-17 19:03 UTC (permalink / raw)
To: Peter Maydell
Cc: Alex Bennée, Mark Cave-Ayland, QEMU Developers,
Philippe Mathieu-Daudé
On 7/17/21 10:43 AM, Peter Maydell wrote:
> On Mon, 12 Jul 2021 at 16:46, Richard Henderson
> <richard.henderson@linaro.org> wrote:
>>
>> Request that the one TB returns immediately, so that
>> we release the exclusive lock as soon as possible.
>>
>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>> ---
>> accel/tcg/cpu-exec.c | 11 ++++++++---
>> 1 file changed, 8 insertions(+), 3 deletions(-)
>>
>> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
>> index 2206c463f5..5bb099174f 100644
>> --- a/accel/tcg/cpu-exec.c
>> +++ b/accel/tcg/cpu-exec.c
>> @@ -330,8 +330,7 @@ void cpu_exec_step_atomic(CPUState *cpu)
>> CPUArchState *env = (CPUArchState *)cpu->env_ptr;
>> TranslationBlock *tb;
>> target_ulong cs_base, pc;
>> - uint32_t flags;
>> - uint32_t cflags = (curr_cflags(cpu) & ~CF_PARALLEL) | 1;
>> + uint32_t flags, cflags;
>> int tb_exit;
>>
>> if (sigsetjmp(cpu->jmp_env, 0) == 0) {
>> @@ -341,8 +340,14 @@ void cpu_exec_step_atomic(CPUState *cpu)
>> cpu->running = true;
>>
>> cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
>> - tb = tb_lookup(cpu, pc, cs_base, flags, cflags);
>>
>> + cflags = curr_cflags(cpu);
>> + /* Execute in a serial context. */
>> + cflags &= ~CF_PARALLEL;
>> + /* After 1 insn, return and release the exclusive lock. */
>> + cflags |= CF_NO_GOTO_TB | CF_NO_GOTO_PTR | 1;
>> +
>> + tb = tb_lookup(cpu, pc, cs_base, flags, cflags);
>> if (tb == NULL) {
>> mmap_lock();
>> tb = tb_gen_code(cpu, pc, cs_base, flags, cflags);
>
> So previously we would have executed possibly a chain of TBs
> before releasing the lock, and now we definitely execute just one?
Correct.
> (I guess the execute-a-chain case is unlikely given the TB
> only has one insn and we know it's an exclusive insn...)
I think it's actually likely. While the tb would definitely end after one insn, we had
passed nothing down that would lead to returning to the main loop.
r~
>
> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
>
> thanks
> -- PMM
>
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v2 10/10] accel/tcg: Encode breakpoint info into tb->cflags
2021-07-17 17:58 ` Peter Maydell
@ 2021-07-17 19:20 ` Richard Henderson
0 siblings, 0 replies; 37+ messages in thread
From: Richard Henderson @ 2021-07-17 19:20 UTC (permalink / raw)
To: Peter Maydell
Cc: Alex Bennée, Mark Cave-Ayland, QEMU Developers,
Philippe Mathieu-Daudé
On 7/17/21 10:58 AM, Peter Maydell wrote:
>> +static uint32_t cflags_for_breakpoints(CPUState *cpu, target_ulong pc,
>> + uint32_t cflags)
>> +{
>> + uint32_t bflags = 0;
>> +
>> + if (unlikely(cpu->singlestep_enabled)) {
>> + bflags = CF_BP_SSTEP;
>> + } else {
>
> Won't this ignore breakpoints when singlestepping ?
Single-step already has priority over other breakpoints:
> /* Pass breakpoint hits to target for further processing */
> - if (!db->singlestep_enabled
> - && unlikely(!QTAILQ_EMPTY(&cpu->breakpoints))) {
r~
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v2 08/10] accel/tcg: Adjust interface of TranslatorOps.breakpoint_check
2021-07-17 17:52 ` Peter Maydell
@ 2021-07-17 19:41 ` Richard Henderson
0 siblings, 0 replies; 37+ messages in thread
From: Richard Henderson @ 2021-07-17 19:41 UTC (permalink / raw)
To: Peter Maydell
Cc: Alex Bennée, Mark Cave-Ayland, QEMU Developers,
Philippe Mathieu-Daudé
On 7/17/21 10:52 AM, Peter Maydell wrote:
> On Mon, 12 Jul 2021 at 16:48, Richard Henderson
> <richard.henderson@linaro.org> wrote:
>>
>> We don't need the whole CPUBreakpoint structure in the check,
>> only the flags. Return the instruction length to consolidate
>> the adjustment of db->pc_next.
>>
>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>
>
>> diff --git a/target/avr/translate.c b/target/avr/translate.c
>> index 8237a03c23..73ff467926 100644
>> --- a/target/avr/translate.c
>> +++ b/target/avr/translate.c
>> @@ -2944,13 +2944,13 @@ static void avr_tr_insn_start(DisasContextBase *dcbase, CPUState *cs)
>> tcg_gen_insn_start(ctx->npc);
>> }
>>
>> -static bool avr_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
>> - const CPUBreakpoint *bp)
>> +static int avr_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
>> + int bp_flags)
>> {
>> DisasContext *ctx = container_of(dcbase, DisasContext, base);
>>
>> gen_breakpoint(ctx);
>> - return true;
>> + return 2; /* minimum instruction length */
>
> Here we weren't advancing pc_next at all, and now we do.
>
>> diff --git a/target/mips/tcg/translate.c b/target/mips/tcg/translate.c
>> index 47c967acbf..c7b9d813c2 100644
>> --- a/target/mips/tcg/translate.c
>> +++ b/target/mips/tcg/translate.c
>> @@ -16190,22 +16190,16 @@ static void mips_tr_insn_start(DisasContextBase *dcbase, CPUState *cs)
>> ctx->btarget);
>> }
>>
>> -static bool mips_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
>> - const CPUBreakpoint *bp)
>> +static int mips_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
>> + int bp_flags)
>> {
>> DisasContext *ctx = container_of(dcbase, DisasContext, base);
>>
>> save_cpu_state(ctx, 1);
>> ctx->base.is_jmp = DISAS_NORETURN;
>> gen_helper_raise_exception_debug(cpu_env);
>> - /*
>> - * The address covered by the breakpoint must be included in
>> - * [tb->pc, tb->pc + tb->size) in order to for it to be
>> - * properly cleared -- thus we increment the PC here so that
>> - * the logic setting tb->size below does the right thing.
>> - */
>> - ctx->base.pc_next += 4;
>> - return true;
>> +
>> + return 2; /* minimum instruction length */
>> }
>
> Here we were advancing by 4 and now advance by 2.
>
>> diff --git a/target/riscv/translate.c b/target/riscv/translate.c
>> index deda0c8a44..8a6bc58572 100644
>> --- a/target/riscv/translate.c
>> +++ b/target/riscv/translate.c
>> @@ -961,20 +961,15 @@ static void riscv_tr_insn_start(DisasContextBase *dcbase, CPUState *cpu)
>> tcg_gen_insn_start(ctx->base.pc_next);
>> }
>>
>> -static bool riscv_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
>> - const CPUBreakpoint *bp)
>> +static int riscv_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cpu,
>> + int bp_flags)
>> {
>> DisasContext *ctx = container_of(dcbase, DisasContext, base);
>>
>> tcg_gen_movi_tl(cpu_pc, ctx->base.pc_next);
>> ctx->base.is_jmp = DISAS_NORETURN;
>> gen_exception_debug();
>> - /* The address covered by the breakpoint must be included in
>> - [tb->pc, tb->pc + tb->size) in order to for it to be
>> - properly cleared -- thus we increment the PC here so that
>> - the logic setting tb->size below does the right thing. */
>> - ctx->base.pc_next += 4;
>> - return true;
>> + return 2; /* minimum instruction length */
>> }
>
> Ditto.
>
> If these are intentional changes (are they bugfixes?) they should be in a
> separate patch.
Yes, they are bug fixes.
r~
>
> Otherwise
> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
>
> thanks
> -- PMM
>
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v2 04/10] accel/tcg: Drop CF_NO_GOTO_PTR from -d nochain
2021-07-17 18:38 ` Richard Henderson
@ 2021-07-17 19:48 ` Alex Bennée
0 siblings, 0 replies; 37+ messages in thread
From: Alex Bennée @ 2021-07-17 19:48 UTC (permalink / raw)
To: Richard Henderson; +Cc: mark.cave-ayland, qemu-devel, f4bug
Richard Henderson <richard.henderson@linaro.org> writes:
> On 7/17/21 10:39 AM, Alex Bennée wrote:
>> Richard Henderson <richard.henderson@linaro.org> writes:
>>
>>> The purpose of suppressing goto_ptr from -d nochain had been
>>> to return to the main loop so that -d cpu would be recognized.
>> Hmm is it though? I've always treated it as ensuring we always come
>> out
>> into the main loop (which is helpful for debugging).
>
> What's helpful for debugging wrt the main loop beyond logging?
Usually if I rr a bug I reverse continue to the top of the loop for my
re-run. I guess we can put breakpoints elsewhere it's just another place
to remember.
>
>
> r~
>
>>
>>> But we now include -d cpu logging in helper_lookup_tb_ptr so
>>> there is no need to exclude goto_ptr.
>>>
>>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>>> ---
>>> accel/tcg/cpu-exec.c | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
>>> index d3232d5764..70ea3c7d68 100644
>>> --- a/accel/tcg/cpu-exec.c
>>> +++ b/accel/tcg/cpu-exec.c
>>> @@ -150,7 +150,7 @@ uint32_t curr_cflags(CPUState *cpu)
>>> uint32_t cflags = cpu->tcg_cflags;
>>> if (qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
>>> - cflags |= CF_NO_GOTO_TB | CF_NO_GOTO_PTR;
>>> + cflags |= CF_NO_GOTO_TB;
>>> }
>>> return cflags;
>>
--
Alex Bennée
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v2 05/10] accel/tcg: Handle -singlestep in curr_cflags
2021-07-17 18:42 ` Richard Henderson
@ 2021-07-17 19:51 ` Peter Maydell
0 siblings, 0 replies; 37+ messages in thread
From: Peter Maydell @ 2021-07-17 19:51 UTC (permalink / raw)
To: Richard Henderson
Cc: Mark Cave-Ayland, Alex Bennée, QEMU Developers,
Philippe Mathieu-Daudé
On Sat, 17 Jul 2021 at 19:43, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> On 7/17/21 10:42 AM, Alex Bennée wrote:
> > Hmm we are testing a magic global here and looking at
> > cpu->singlestep_enabled lower down. We have a transient singlestep which
> > is turned on an off via cpu->singlestep_enabled and a global as a debug
> > option. Can we rationalise it further?
>
> Not sure what you're asking.
>
> cpu->singlestep_enabled raises a debug exception after one insn, whereas singlestep merely
> exits the tb after one insn.
We really should rename 'singlestep' to 'one_insn_per_tb' or something,
because it's continually confusing...
-- PMM
^ permalink raw reply [flat|nested] 37+ messages in thread
end of thread, other threads:[~2021-07-17 19:53 UTC | newest]
Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-12 15:39 [PATCH v2 00/10] tcg: breakpoint reorg Richard Henderson
2021-07-12 15:39 ` [PATCH v2 01/10] accel/tcg: Reduce CF_COUNT_MASK to match TCG_MAX_INSNS Richard Henderson
2021-07-17 17:30 ` Peter Maydell
2021-07-17 18:31 ` Richard Henderson
2021-07-17 17:33 ` Alex Bennée
2021-07-12 15:39 ` [PATCH v2 02/10] accel/tcg: Move curr_cflags into cpu-exec.c Richard Henderson
2021-07-17 17:32 ` Peter Maydell
2021-07-17 17:34 ` Alex Bennée
2021-07-12 15:39 ` [PATCH v2 03/10] accel/tcg: Add CF_NO_GOTO_TB and CF_NO_GOTO_PTR Richard Henderson
2021-07-17 17:36 ` Alex Bennée
2021-07-17 17:36 ` Peter Maydell
2021-07-12 15:39 ` [PATCH v2 04/10] accel/tcg: Drop CF_NO_GOTO_PTR from -d nochain Richard Henderson
2021-07-17 17:39 ` Peter Maydell
2021-07-17 17:39 ` Alex Bennée
2021-07-17 18:38 ` Richard Henderson
2021-07-17 19:48 ` Alex Bennée
2021-07-12 15:39 ` [PATCH v2 05/10] accel/tcg: Handle -singlestep in curr_cflags Richard Henderson
2021-07-17 17:41 ` Peter Maydell
2021-07-17 17:42 ` Alex Bennée
2021-07-17 18:42 ` Richard Henderson
2021-07-17 19:51 ` Peter Maydell
2021-07-12 15:40 ` [PATCH v2 06/10] accel/tcg: Use CF_NO_GOTO_{TB, PTR} in cpu_exec_step_atomic Richard Henderson
2021-07-17 17:43 ` Peter Maydell
2021-07-17 19:03 ` Richard Henderson
2021-07-17 17:47 ` [PATCH v2 06/10] accel/tcg: Use CF_NO_GOTO_{TB,PTR} " Alex Bennée
2021-07-12 15:40 ` [PATCH v2 07/10] accel/tcg: Move cflags lookup into tb_find Richard Henderson
2021-07-17 17:45 ` Peter Maydell
2021-07-12 15:40 ` [PATCH v2 08/10] accel/tcg: Adjust interface of TranslatorOps.breakpoint_check Richard Henderson
2021-07-17 17:52 ` Peter Maydell
2021-07-17 19:41 ` Richard Henderson
2021-07-12 15:40 ` [PATCH v2 09/10] accel/tcg: Hoist tb_cflags to a local in translator_loop Richard Henderson
2021-07-17 17:50 ` Alex Bennée
2021-07-12 15:40 ` [PATCH v2 10/10] accel/tcg: Encode breakpoint info into tb->cflags Richard Henderson
2021-07-17 17:58 ` Peter Maydell
2021-07-17 19:20 ` Richard Henderson
2021-07-12 18:20 ` [PATCH v2 00/10] tcg: breakpoint reorg Mark Cave-Ayland
2021-07-17 17:16 ` Richard Henderson
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.