* [Qemu-devel] [PATCH v3 00/25] Do away with TB retranslation
@ 2015-09-22 20:24 Richard Henderson
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 01/25] tcg: Rename debug_insn_start to insn_start Richard Henderson
                   ` (24 more replies)
  0 siblings, 25 replies; 53+ messages in thread
From: Richard Henderson @ 2015-09-22 20:24 UTC
  To: qemu-devel; +Cc: peter.maydell, alex.bennee, aurelien

Version 3.  Notable changes:

  (1) Add a guard page at the end of the code_gen_buffer.
      We will segv instead of silently corrupting memory
      if we overrun the buffer.

      The win32 bits were tested under wine only; I haven't put together
      all the right bits under my win7 vm yet.  Although I ought to
      be able to copy them from the wine installation...

  (2) Overflow protection via highwater mark.  At first I was going
      to make this be the solution for win32 only, so that I didn't
      have to figure out how to make SEH dtrt wrt catching #GPF.

      But I can't actually measure the performance overhead of these
      checks under Linux.  Which might not be the case if we instead
      have to call sigsetjmp at the beginning of tb_gen_code.  So now
      I'm thinking this might be a better solution universally.
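
For readers unfamiliar with the guard-page idea in (1), here is a minimal
POSIX-only sketch. It is not the code from this series: alloc_with_guard is
a hypothetical name, and a real code buffer would also need PROT_EXEC, which
is left out here. The usable region is rounded up to whole pages and one
extra page after it is made inaccessible, so an overrun faults instead of
scribbling on whatever follows.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 * Maps `size` bytes (rounded up to whole pages) followed by one
 * PROT_NONE guard page.  On success returns the buffer and stores
 * the usable length in *usable; returns NULL on failure.
 */
static void *alloc_with_guard(size_t size, size_t *usable)
{
    assert(size > 0);

    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t rounded = (size + page - 1) / page * page;

    uint8_t *buf = mmap(NULL, rounded + page, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED) {
        return NULL;
    }

    /* Any access to the page right after the usable region now faults. */
    if (mprotect(buf + rounded, page, PROT_NONE) != 0) {
        munmap(buf, rounded + page);
        return NULL;
    }

    *usable = rounded;
    return buf;
}
```

Writes up to buf[usable - 1] succeed; a write to buf[usable] would raise
SIGSEGV rather than corrupt adjacent memory.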

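The highwater check in (2) can be pictured roughly as follows. This is a toy
sketch under stated assumptions, not the implementation in patch 25: CodeBuf,
codebuf_emit, translate_block, BUF_SIZE and MAX_OP_SIZE are all made-up names.
The mark sits one maximum-sized op before the end of the buffer, so a single
pointer comparison per op is enough to guarantee the next op still fits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* All names and sizes here are hypothetical, chosen for illustration. */
enum { BUF_SIZE = 4096, MAX_OP_SIZE = 64 };

typedef struct {
    uint8_t buf[BUF_SIZE];
    uint8_t *ptr;        /* next free byte */
    uint8_t *highwater;  /* beyond this, another op might not fit */
} CodeBuf;

static void codebuf_init(CodeBuf *cb)
{
    cb->ptr = cb->buf;
    cb->highwater = cb->buf + BUF_SIZE - MAX_OP_SIZE;
}

/* Emit one op of at most MAX_OP_SIZE bytes; -1 if past the mark. */
static int codebuf_emit(CodeBuf *cb, const uint8_t *bytes, size_t len)
{
    assert(len <= MAX_OP_SIZE);
    if (cb->ptr > cb->highwater) {
        return -1;              /* one cheap compare, no signal handling */
    }
    memcpy(cb->ptr, bytes, len);
    cb->ptr += len;
    return 0;
}

/* Emit n_ops maximum-sized ops; on overflow discard the partial block. */
static int translate_block(CodeBuf *cb, size_t n_ops)
{
    static const uint8_t op[MAX_OP_SIZE] = { 0 };
    uint8_t *start = cb->ptr;

    for (size_t i = 0; i < n_ops; i++) {
        if (codebuf_emit(cb, op, sizeof(op)) < 0) {
            cb->ptr = start;    /* caller could flush and retry */
            return -1;
        }
    }
    return 0;
}
```

Compared with catching a fault from the guard page, the failure here is an
ordinary return value, so no sigsetjmp or SEH handler is needed on the
translation path.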

r~


Richard Henderson (25):
  tcg: Rename debug_insn_start to insn_start
  target-*: Unconditionally emit tcg_gen_insn_start
  target-*: Increment num_insns immediately after tcg_gen_insn_start
  target-*: Introduce and use cpu_breakpoint_test
  tcg: Allow extra data to be attached to insn_start
  target-arm: Add condexec state to insn_start
  target-i386: Add cc_op state to insn_start
  target-mips: Add delayed branch state to insn_start
  target-s390x: Add cc_op state to insn_start
  target-sh4: Add flags state to insn_start
  target-cris: Mirror gen_opc_pc into insn_start
  target-sparc: Tidy gen_branch_a interface
  target-sparc: Split out gen_branch_n
  target-sparc: Remove gen_opc_jump_pc
  target-sparc: Add npc state to insn_start
  tcg: Merge cpu_gen_code into tb_gen_code
  target-*: Drop cpu_gen_code define
  tcg: Add TCG_MAX_INSNS
  tcg: Pass data argument to restore_state_to_opc
  tcg: Save insn data and use it in cpu_restore_state_from_tb
  tcg: Remove gen_intermediate_code_pc
  tcg: Remove tcg_gen_code_search_pc
  tcg: Emit prologue to the beginning of code_gen_buffer
  tcg: Allocate a guard page after code_gen_buffer
  tcg: Check for overflow via highwater mark

 include/exec/exec-all.h       |  12 +-
 include/qom/cpu.h             |  16 ++
 target-alpha/cpu.h            |   1 -
 target-alpha/translate.c      |  70 ++----
 target-arm/cpu.h              |   2 +-
 target-arm/translate-a64.c    |  48 +---
 target-arm/translate.c        |  83 +++----
 target-arm/translate.h        |   8 +-
 target-cris/cpu.h             |   1 -
 target-cris/translate.c       |  93 ++------
 target-cris/translate_v10.c   |   3 -
 target-i386/cpu.h             |   2 +-
 target-i386/translate.c       | 106 +++------
 target-lm32/cpu.h             |   1 -
 target-lm32/translate.c       |  83 ++-----
 target-m68k/cpu.h             |   1 -
 target-m68k/translate.c       |  82 ++-----
 target-microblaze/cpu.h       |   1 -
 target-microblaze/translate.c |  83 ++-----
 target-mips/cpu.h             |   2 +-
 target-mips/translate.c       |  98 +++-----
 target-moxie/cpu.h            |   1 -
 target-moxie/translate.c      |  82 +++----
 target-openrisc/cpu.h         |   1 -
 target-openrisc/translate.c   |  78 ++-----
 target-ppc/cpu.h              |   1 -
 target-ppc/translate.c        |  72 ++----
 target-s390x/cpu.h            |   2 +-
 target-s390x/translate.c      |  78 ++-----
 target-sh4/cpu.h              |   2 +-
 target-sh4/translate.c        |  91 +++-----
 target-sparc/cpu.h            |   2 +-
 target-sparc/translate.c      | 185 +++++++--------
 target-tilegx/cpu.h           |   1 -
 target-tilegx/translate.c     |  58 ++---
 target-tricore/translate.c    |  59 ++---
 target-unicore32/translate.c  |  83 ++-----
 target-xtensa/cpu.h           |   1 -
 target-xtensa/translate.c     |  79 ++-----
 tcg/tcg-op.h                  |  52 ++++-
 tcg/tcg-opc.h                 |   4 +-
 tcg/tcg.c                     | 158 +++++++------
 tcg/tcg.h                     |  21 +-
 tci.c                         |   9 -
 translate-all.c               | 520 +++++++++++++++++++++++++-----------------
 45 files changed, 950 insertions(+), 1486 deletions(-)

-- 
2.4.3


* [Qemu-devel] [PATCH v3 01/25] tcg: Rename debug_insn_start to insn_start
  2015-09-22 20:24 [Qemu-devel] [PATCH v3 00/25] Do away with TB retranslation Richard Henderson
@ 2015-09-22 20:24 ` Richard Henderson
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 02/25] target-*: Unconditionally emit tcg_gen_insn_start Richard Henderson
                   ` (23 subsequent siblings)
  24 siblings, 0 replies; 53+ messages in thread
From: Richard Henderson @ 2015-09-22 20:24 UTC
  To: qemu-devel; +Cc: peter.maydell, alex.bennee, aurelien

With an eye toward making it mandatory.

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target-alpha/translate.c      | 2 +-
 target-arm/translate-a64.c    | 2 +-
 target-arm/translate.c        | 2 +-
 target-cris/translate.c       | 4 ++--
 target-cris/translate_v10.c   | 2 +-
 target-i386/translate.c       | 2 +-
 target-lm32/translate.c       | 2 +-
 target-m68k/translate.c       | 2 +-
 target-microblaze/translate.c | 2 +-
 target-mips/translate.c       | 2 +-
 target-moxie/translate.c      | 2 +-
 target-openrisc/translate.c   | 2 +-
 target-ppc/translate.c        | 2 +-
 target-s390x/translate.c      | 2 +-
 target-sh4/translate.c        | 2 +-
 target-sparc/translate.c      | 2 +-
 target-tilegx/translate.c     | 2 +-
 target-unicore32/translate.c  | 2 +-
 target-xtensa/translate.c     | 2 +-
 tcg/tcg-op.h                  | 6 +++---
 tcg/tcg-opc.h                 | 4 ++--
 tcg/tcg.c                     | 6 +++---
 tci.c                         | 9 ---------
 23 files changed, 28 insertions(+), 37 deletions(-)

diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index 2ba5fb8..76916f4 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -2940,7 +2940,7 @@ static inline void gen_intermediate_code_internal(AlphaCPU *cpu,
         num_insns++;
 
 	if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
-            tcg_gen_debug_insn_start(ctx.pc);
+            tcg_gen_insn_start(ctx.pc);
         }
 
         TCGV_UNUSED_I64(ctx.zero);
diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index ec0936c..a618711 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -11109,7 +11109,7 @@ void gen_intermediate_code_internal_a64(ARMCPU *cpu,
         }
 
         if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
-            tcg_gen_debug_insn_start(dc->pc);
+            tcg_gen_insn_start(dc->pc);
         }
 
         if (dc->ss_active && !dc->pstate_ss) {
diff --git a/target-arm/translate.c b/target-arm/translate.c
index 84a21ac..b521fc8 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -11353,7 +11353,7 @@ static inline void gen_intermediate_code_internal(ARMCPU *cpu,
             gen_io_start();
 
         if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
-            tcg_gen_debug_insn_start(dc->pc);
+            tcg_gen_insn_start(dc->pc);
         }
 
         if (dc->ss_active && !dc->pstate_ss) {
diff --git a/target-cris/translate.c b/target-cris/translate.c
index d5b54e1..c5a22af 100644
--- a/target-cris/translate.c
+++ b/target-cris/translate.c
@@ -2995,8 +2995,8 @@ static unsigned int crisv32_decoder(CPUCRISState *env, DisasContext *dc)
     int i;
 
     if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
-        tcg_gen_debug_insn_start(dc->pc);
-        }
+        tcg_gen_insn_start(dc->pc);
+    }
 
     /* Load a halfword onto the instruction register.  */
         dc->ir = cris_fetch(env, dc, dc->pc, 2, 0);
diff --git a/target-cris/translate_v10.c b/target-cris/translate_v10.c
index da0b2ca..12d7dfc 100644
--- a/target-cris/translate_v10.c
+++ b/target-cris/translate_v10.c
@@ -1200,7 +1200,7 @@ static unsigned int crisv10_decoder(CPUCRISState *env, DisasContext *dc)
     unsigned int insn_len = 2;
 
     if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP)))
-        tcg_gen_debug_insn_start(dc->pc);
+        tcg_gen_insn_start(dc->pc);
 
     /* Load a halfword onto the instruction register.  */
     dc->ir = cpu_lduw_code(env, dc->pc);
diff --git a/target-i386/translate.c b/target-i386/translate.c
index 8b35de1..c18f82b 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -4402,7 +4402,7 @@ static target_ulong disas_insn(CPUX86State *env, DisasContext *s,
     int rex_w, rex_r;
 
     if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
-        tcg_gen_debug_insn_start(pc_start);
+        tcg_gen_insn_start(pc_start);
     }
     s->pc = pc_start;
     prefixes = 0;
diff --git a/target-lm32/translate.c b/target-lm32/translate.c
index cf7042e..b1b4cbb 100644
--- a/target-lm32/translate.c
+++ b/target-lm32/translate.c
@@ -1006,7 +1006,7 @@ static const DecoderInfo decinfo[] = {
 static inline void decode(DisasContext *dc, uint32_t ir)
 {
     if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
-        tcg_gen_debug_insn_start(dc->pc);
+        tcg_gen_insn_start(dc->pc);
     }
 
     dc->ir = ir;
diff --git a/target-m68k/translate.c b/target-m68k/translate.c
index 3cdf665..e34bf2b 100644
--- a/target-m68k/translate.c
+++ b/target-m68k/translate.c
@@ -2956,7 +2956,7 @@ static void disas_m68k_insn(CPUM68KState * env, DisasContext *s)
     uint16_t insn;
 
     if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
-        tcg_gen_debug_insn_start(s->pc);
+        tcg_gen_insn_start(s->pc);
     }
 
     insn = cpu_lduw_code(env, s->pc);
diff --git a/target-microblaze/translate.c b/target-microblaze/translate.c
index 3de8944..0d340c0 100644
--- a/target-microblaze/translate.c
+++ b/target-microblaze/translate.c
@@ -1589,7 +1589,7 @@ static inline void decode(DisasContext *dc, uint32_t ir)
     int i;
 
     if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
-        tcg_gen_debug_insn_start(dc->pc);
+        tcg_gen_insn_start(dc->pc);
     }
 
     dc->ir = ir;
diff --git a/target-mips/translate.c b/target-mips/translate.c
index 87d4959..2b3f2b0 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -18905,7 +18905,7 @@ static void decode_opc(CPUMIPSState *env, DisasContext *ctx)
     }
 
     if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
-        tcg_gen_debug_insn_start(ctx->pc);
+        tcg_gen_insn_start(ctx->pc);
     }
 
     op = MASK_OP_MAJOR(ctx->opcode);
diff --git a/target-moxie/translate.c b/target-moxie/translate.c
index cc77366..0bb94a0 100644
--- a/target-moxie/translate.c
+++ b/target-moxie/translate.c
@@ -154,7 +154,7 @@ static int decode_opc(MoxieCPU *cpu, DisasContext *ctx)
     int length = 2;
 
     if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
-        tcg_gen_debug_insn_start(ctx->pc);
+        tcg_gen_insn_start(ctx->pc);
     }
 
     /* Examine the 16-bit opcode.  */
diff --git a/target-openrisc/translate.c b/target-openrisc/translate.c
index 473556e..727fbba 100644
--- a/target-openrisc/translate.c
+++ b/target-openrisc/translate.c
@@ -1689,7 +1689,7 @@ static inline void gen_intermediate_code_internal(OpenRISCCPU *cpu,
         }
 
         if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
-            tcg_gen_debug_insn_start(dc->pc);
+            tcg_gen_insn_start(dc->pc);
         }
 
         if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO)) {
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index c0eed13..c46133d 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -11516,7 +11516,7 @@ static inline void gen_intermediate_code_internal(PowerPCCPU *cpu,
                     ctx.opcode, opc1(ctx.opcode), opc2(ctx.opcode),
                     opc3(ctx.opcode), ctx.le_mode ? "little" : "big");
         if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
-            tcg_gen_debug_insn_start(ctx.nip);
+            tcg_gen_insn_start(ctx.nip);
         }
         ctx.nip += 4;
         table = env->opcodes;
diff --git a/target-s390x/translate.c b/target-s390x/translate.c
index 2bca33a..a87d83c 100644
--- a/target-s390x/translate.c
+++ b/target-s390x/translate.c
@@ -5375,7 +5375,7 @@ static inline void gen_intermediate_code_internal(S390CPU *cpu,
         }
 
         if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
-            tcg_gen_debug_insn_start(dc.pc);
+            tcg_gen_insn_start(dc.pc);
         }
 
         status = NO_EXIT;
diff --git a/target-sh4/translate.c b/target-sh4/translate.c
index 724c0e7..d9d2c02 100644
--- a/target-sh4/translate.c
+++ b/target-sh4/translate.c
@@ -1791,7 +1791,7 @@ static void decode_opc(DisasContext * ctx)
     uint32_t old_flags = ctx->flags;
 
     if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
-        tcg_gen_debug_insn_start(ctx->pc);
+        tcg_gen_insn_start(ctx->pc);
     }
 
     _decode_opc(ctx);
diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index 4690b46..ef17e26 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -2483,7 +2483,7 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
     target_long simm;
 
     if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
-        tcg_gen_debug_insn_start(dc->pc);
+        tcg_gen_insn_start(dc->pc);
     }
 
     opc = GET_FIELD(insn, 0, 1);
diff --git a/target-tilegx/translate.c b/target-tilegx/translate.c
index e70c3e5..3fb7fc6 100644
--- a/target-tilegx/translate.c
+++ b/target-tilegx/translate.c
@@ -2009,7 +2009,7 @@ static void translate_one_bundle(DisasContext *dc, uint64_t bundle)
     dc->num_wb = 0;
 
     if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
-        tcg_gen_debug_insn_start(dc->pc);
+        tcg_gen_insn_start(dc->pc);
     }
 
     qemu_log_mask(CPU_LOG_TB_IN_ASM, "  %" PRIx64 ":  { ", dc->pc);
diff --git a/target-unicore32/translate.c b/target-unicore32/translate.c
index 2fc78e6..63a5192 100644
--- a/target-unicore32/translate.c
+++ b/target-unicore32/translate.c
@@ -1795,7 +1795,7 @@ static void disas_uc32_insn(CPUUniCore32State *env, DisasContext *s)
     unsigned int insn;
 
     if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
-        tcg_gen_debug_insn_start(s->pc);
+        tcg_gen_insn_start(s->pc);
     }
 
     insn = cpu_ldl_code(env, s->pc);
diff --git a/target-xtensa/translate.c b/target-xtensa/translate.c
index a29b3e6..ea777da 100644
--- a/target-xtensa/translate.c
+++ b/target-xtensa/translate.c
@@ -3078,7 +3078,7 @@ void gen_intermediate_code_internal(XtensaCPU *cpu,
         }
 
         if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
-            tcg_gen_debug_insn_start(dc.pc);
+            tcg_gen_insn_start(dc.pc);
         }
 
         ++dc.ccount_delta;
diff --git a/tcg/tcg-op.h b/tcg/tcg-op.h
index 6da083a..6409db8 100644
--- a/tcg/tcg-op.h
+++ b/tcg/tcg-op.h
@@ -701,14 +701,14 @@ static inline void tcg_gen_concat32_i64(TCGv_i64 ret, TCGv_i64 lo, TCGv_i64 hi)
 #endif
 
 /* debug info: write the PC of the corresponding QEMU CPU instruction */
-static inline void tcg_gen_debug_insn_start(uint64_t pc)
+static inline void tcg_gen_insn_start(uint64_t pc)
 {
     /* XXX: must really use a 32 bit size for TCGArg in all cases */
 #if TARGET_LONG_BITS > TCG_TARGET_REG_BITS
-    tcg_gen_op2ii(INDEX_op_debug_insn_start,
+    tcg_gen_op2ii(INDEX_op_insn_start,
                   (uint32_t)(pc), (uint32_t)(pc >> 32));
 #else
-    tcg_gen_op1i(INDEX_op_debug_insn_start, pc);
+    tcg_gen_op1i(INDEX_op_insn_start, pc);
 #endif
 }
 
diff --git a/tcg/tcg-opc.h b/tcg/tcg-opc.h
index 02bbf30..f60d3c2 100644
--- a/tcg/tcg-opc.h
+++ b/tcg/tcg-opc.h
@@ -175,9 +175,9 @@ DEF(mulsh_i64, 1, 2, 0, IMPL(TCG_TARGET_HAS_mulsh_i64))
 
 /* QEMU specific */
 #if TARGET_LONG_BITS > TCG_TARGET_REG_BITS
-DEF(debug_insn_start, 0, 0, 2, TCG_OPF_NOT_PRESENT)
+DEF(insn_start, 0, 0, 2, TCG_OPF_NOT_PRESENT)
 #else
-DEF(debug_insn_start, 0, 0, 1, TCG_OPF_NOT_PRESENT)
+DEF(insn_start, 0, 0, 1, TCG_OPF_NOT_PRESENT)
 #endif
 DEF(exit_tb, 0, 0, 1, TCG_OPF_BB_END)
 DEF(goto_tb, 0, 0, 1, TCG_OPF_BB_END)
diff --git a/tcg/tcg.c b/tcg/tcg.c
index a2cb027..df8788b 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -990,7 +990,7 @@ void tcg_dump_ops(TCGContext *s)
         def = &tcg_op_defs[c];
         args = &s->gen_opparam_buf[op->args];
 
-        if (c == INDEX_op_debug_insn_start) {
+        if (c == INDEX_op_insn_start) {
             uint64_t pc;
 #if TARGET_LONG_BITS > TCG_TARGET_REG_BITS
             pc = ((uint64_t)args[1] << 32) | args[0];
@@ -1400,7 +1400,7 @@ static void tcg_liveness_analysis(TCGContext *s)
                 }
             }
             break;
-        case INDEX_op_debug_insn_start:
+        case INDEX_op_insn_start:
             break;
         case INDEX_op_discard:
             /* mark the temporary as dead */
@@ -2359,7 +2359,7 @@ static inline int tcg_gen_code_common(TCGContext *s,
         case INDEX_op_movi_i64:
             tcg_reg_alloc_movi(s, args, dead_args, sync_args);
             break;
-        case INDEX_op_debug_insn_start:
+        case INDEX_op_insn_start:
             break;
         case INDEX_op_discard:
             temp_dead(s, args[0]);
diff --git a/tci.c b/tci.c
index 70eaab2..b5ed7b1 100644
--- a/tci.c
+++ b/tci.c
@@ -1081,15 +1081,6 @@ uintptr_t tcg_qemu_tb_exec(CPUArchState *env, uint8_t *tb_ptr)
 
             /* QEMU specific operations. */
 
-#if TARGET_LONG_BITS > TCG_TARGET_REG_BITS
-        case INDEX_op_debug_insn_start:
-            TODO();
-            break;
-#else
-        case INDEX_op_debug_insn_start:
-            TODO();
-            break;
-#endif
         case INDEX_op_exit_tb:
             next_tb = *(uint64_t *)tb_ptr;
             goto exit;
-- 
2.4.3


* [Qemu-devel] [PATCH v3 02/25] target-*: Unconditionally emit tcg_gen_insn_start
  2015-09-22 20:24 [Qemu-devel] [PATCH v3 00/25] Do away with TB retranslation Richard Henderson
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 01/25] tcg: Rename debug_insn_start to insn_start Richard Henderson
@ 2015-09-22 20:24 ` Richard Henderson
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 03/25] target-*: Increment num_insns immediately after tcg_gen_insn_start Richard Henderson
                   ` (22 subsequent siblings)
  24 siblings, 0 replies; 53+ messages in thread
From: Richard Henderson @ 2015-09-22 20:24 UTC
  To: qemu-devel; +Cc: peter.maydell, alex.bennee, aurelien

While we're at it, emit the opcode adjacent to where we currently
record data for search_pc.  This puts gen_io_start et al on the
"correct" side of the marker.
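
One way to read "correct side", sketched as a toy model rather than real TCG
code (OpKind, Op and owning_pc are hypothetical names): if every guest
instruction begins with its marker, then any later op, including the one
emitted by gen_io_start, can be attributed to a guest PC by scanning back to
the nearest marker.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy op stream; these types are illustrative, not TCG's own. */
typedef enum { OP_INSN_START, OP_IO_START, OP_OTHER } OpKind;

typedef struct {
    OpKind kind;
    uint64_t pc;   /* meaningful only for OP_INSN_START */
} Op;

/*
 * Find the guest PC that owns the op at index idx: the pc of the
 * nearest OP_INSN_START at or before idx.  Returns 0 on success,
 * -1 if no marker precedes the op.
 */
static int owning_pc(const Op *ops, size_t idx, uint64_t *pc)
{
    assert(ops != NULL);
    for (size_t i = idx + 1; i-- > 0; ) {
        if (ops[i].kind == OP_INSN_START) {
            *pc = ops[i].pc;
            return 0;
        }
    }
    return -1;
}
```

With the marker emitted first, an OP_IO_START that belongs to the instruction
at 0x1004 resolves to 0x1004; were it emitted before the marker, it would
resolve to the previous instruction.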

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target-alpha/translate.c      |  6 ++----
 target-arm/translate-a64.c    |  5 +----
 target-arm/translate.c        |  5 +----
 target-cris/translate.c       |  5 +----
 target-cris/translate_v10.c   |  3 ---
 target-i386/translate.c       |  5 ++---
 target-lm32/translate.c       |  5 +----
 target-m68k/translate.c       | 10 +++++-----
 target-microblaze/translate.c |  5 +----
 target-mips/translate.c       |  9 ++++-----
 target-moxie/translate.c      |  6 ++----
 target-openrisc/translate.c   |  5 +----
 target-ppc/translate.c        |  5 ++---
 target-s390x/translate.c      |  6 ++----
 target-sh4/translate.c        | 14 +++++---------
 target-sparc/translate.c      | 10 +++++-----
 target-tilegx/translate.c     |  6 ++----
 target-tricore/translate.c    |  2 ++
 target-unicore32/translate.c  |  5 +----
 target-xtensa/translate.c     |  5 +----
 20 files changed, 41 insertions(+), 81 deletions(-)

diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index 76916f4..60370d6 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -2933,16 +2933,14 @@ static inline void gen_intermediate_code_internal(AlphaCPU *cpu,
             tcg_ctx.gen_opc_instr_start[lj] = 1;
             tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
+        tcg_gen_insn_start(ctx.pc);
+
         if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO)) {
             gen_io_start();
         }
         insn = cpu_ldl_code(env, ctx.pc);
         num_insns++;
 
-	if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
-            tcg_gen_insn_start(ctx.pc);
-        }
-
         TCGV_UNUSED_I64(ctx.zero);
         TCGV_UNUSED_I64(ctx.sink);
         TCGV_UNUSED_I64(ctx.lit);
diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index a618711..6a66ac0 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -11103,15 +11103,12 @@ void gen_intermediate_code_internal_a64(ARMCPU *cpu,
             tcg_ctx.gen_opc_instr_start[lj] = 1;
             tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
+        tcg_gen_insn_start(dc->pc);
 
         if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO)) {
             gen_io_start();
         }
 
-        if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
-            tcg_gen_insn_start(dc->pc);
-        }
-
         if (dc->ss_active && !dc->pstate_ss) {
             /* Singlestep state is Active-pending.
              * If we're in this state at the start of a TB then either
diff --git a/target-arm/translate.c b/target-arm/translate.c
index b521fc8..8348848 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -11348,14 +11348,11 @@ static inline void gen_intermediate_code_internal(ARMCPU *cpu,
             tcg_ctx.gen_opc_instr_start[lj] = 1;
             tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
+        tcg_gen_insn_start(dc->pc);
 
         if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO))
             gen_io_start();
 
-        if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
-            tcg_gen_insn_start(dc->pc);
-        }
-
         if (dc->ss_active && !dc->pstate_ss) {
             /* Singlestep state is Active-pending.
              * If we're in this state at the start of a TB then either
diff --git a/target-cris/translate.c b/target-cris/translate.c
index c5a22af..0a4b363 100644
--- a/target-cris/translate.c
+++ b/target-cris/translate.c
@@ -2994,10 +2994,6 @@ static unsigned int crisv32_decoder(CPUCRISState *env, DisasContext *dc)
     int insn_len = 2;
     int i;
 
-    if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
-        tcg_gen_insn_start(dc->pc);
-    }
-
     /* Load a halfword onto the instruction register.  */
         dc->ir = cris_fetch(env, dc, dc->pc, 2, 0);
 
@@ -3197,6 +3193,7 @@ gen_intermediate_code_internal(CRISCPU *cpu, TranslationBlock *tb,
             tcg_ctx.gen_opc_instr_start[lj] = 1;
             tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
+        tcg_gen_insn_start(dc->pc);
 
         /* Pretty disas.  */
         LOG_DIS("%8.8x:\t", dc->pc);
diff --git a/target-cris/translate_v10.c b/target-cris/translate_v10.c
index 12d7dfc..3ab1c39 100644
--- a/target-cris/translate_v10.c
+++ b/target-cris/translate_v10.c
@@ -1199,9 +1199,6 @@ static unsigned int crisv10_decoder(CPUCRISState *env, DisasContext *dc)
 {
     unsigned int insn_len = 2;
 
-    if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP)))
-        tcg_gen_insn_start(dc->pc);
-
     /* Load a halfword onto the instruction register.  */
     dc->ir = cpu_lduw_code(env, dc->pc);
 
diff --git a/target-i386/translate.c b/target-i386/translate.c
index c18f82b..82d32e1 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -4401,9 +4401,6 @@ static target_ulong disas_insn(CPUX86State *env, DisasContext *s,
     target_ulong next_eip, tval;
     int rex_w, rex_r;
 
-    if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
-        tcg_gen_insn_start(pc_start);
-    }
     s->pc = pc_start;
     prefixes = 0;
     s->override = -1;
@@ -7962,6 +7959,8 @@ static inline void gen_intermediate_code_internal(X86CPU *cpu,
             tcg_ctx.gen_opc_instr_start[lj] = 1;
             tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
+        tcg_gen_insn_start(pc_ptr);
+
         if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO))
             gen_io_start();
 
diff --git a/target-lm32/translate.c b/target-lm32/translate.c
index b1b4cbb..84eeac3 100644
--- a/target-lm32/translate.c
+++ b/target-lm32/translate.c
@@ -1005,10 +1005,6 @@ static const DecoderInfo decinfo[] = {
 
 static inline void decode(DisasContext *dc, uint32_t ir)
 {
-    if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
-        tcg_gen_insn_start(dc->pc);
-    }
-
     dc->ir = ir;
     LOG_DIS("%8.8x\t", dc->ir);
 
@@ -1106,6 +1102,7 @@ void gen_intermediate_code_internal(LM32CPU *cpu,
             tcg_ctx.gen_opc_instr_start[lj] = 1;
             tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
+        tcg_gen_insn_start(dc->pc);
 
         /* Pretty disas.  */
         LOG_DIS("%8.8x:\t", dc->pc);
diff --git a/target-m68k/translate.c b/target-m68k/translate.c
index e34bf2b..bfd9c00 100644
--- a/target-m68k/translate.c
+++ b/target-m68k/translate.c
@@ -2955,10 +2955,6 @@ static void disas_m68k_insn(CPUM68KState * env, DisasContext *s)
 {
     uint16_t insn;
 
-    if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
-        tcg_gen_insn_start(s->pc);
-    }
-
     insn = cpu_lduw_code(env, s->pc);
     s->pc += 2;
 
@@ -3025,8 +3021,12 @@ gen_intermediate_code_internal(M68kCPU *cpu, TranslationBlock *tb,
             tcg_ctx.gen_opc_instr_start[lj] = 1;
             tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
-        if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO))
+        tcg_gen_insn_start(dc->pc);
+
+        if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO)) {
             gen_io_start();
+        }
+
         dc->insn_pc = dc->pc;
 	disas_m68k_insn(env, dc);
         num_insns++;
diff --git a/target-microblaze/translate.c b/target-microblaze/translate.c
index 0d340c0..02ccf45 100644
--- a/target-microblaze/translate.c
+++ b/target-microblaze/translate.c
@@ -1588,10 +1588,6 @@ static inline void decode(DisasContext *dc, uint32_t ir)
 {
     int i;
 
-    if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
-        tcg_gen_insn_start(dc->pc);
-    }
-
     dc->ir = ir;
     LOG_DIS("%8.8x\t", dc->ir);
 
@@ -1718,6 +1714,7 @@ gen_intermediate_code_internal(MicroBlazeCPU *cpu, TranslationBlock *tb,
             tcg_ctx.gen_opc_instr_start[lj] = 1;
                         tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
+        tcg_gen_insn_start(dc->pc);
 
         /* Pretty disas.  */
         LOG_DIS("%8.8x:\t", dc->pc);
diff --git a/target-mips/translate.c b/target-mips/translate.c
index 2b3f2b0..aa0e0fd 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -18904,10 +18904,6 @@ static void decode_opc(CPUMIPSState *env, DisasContext *ctx)
         gen_set_label(l1);
     }
 
-    if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
-        tcg_gen_insn_start(ctx->pc);
-    }
-
     op = MASK_OP_MAJOR(ctx->opcode);
     rs = (ctx->opcode >> 21) & 0x1f;
     rt = (ctx->opcode >> 16) & 0x1f;
@@ -19622,8 +19618,11 @@ gen_intermediate_code_internal(MIPSCPU *cpu, TranslationBlock *tb,
             tcg_ctx.gen_opc_instr_start[lj] = 1;
             tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
-        if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO))
+        tcg_gen_insn_start(ctx.pc);
+
+        if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO)) {
             gen_io_start();
+        }
 
         is_slot = ctx.hflags & MIPS_HFLAG_BMASK;
         if (!(ctx.hflags & MIPS_HFLAG_M16)) {
diff --git a/target-moxie/translate.c b/target-moxie/translate.c
index 0bb94a0..1becfde 100644
--- a/target-moxie/translate.c
+++ b/target-moxie/translate.c
@@ -153,10 +153,6 @@ static int decode_opc(MoxieCPU *cpu, DisasContext *ctx)
     /* Set the default instruction length.  */
     int length = 2;
 
-    if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
-        tcg_gen_insn_start(ctx->pc);
-    }
-
     /* Examine the 16-bit opcode.  */
     opcode = ctx->opcode;
 
@@ -865,6 +861,8 @@ gen_intermediate_code_internal(MoxieCPU *cpu, TranslationBlock *tb,
             tcg_ctx.gen_opc_instr_start[lj] = 1;
             tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
+        tcg_gen_insn_start(ctx.pc);
+
         ctx.opcode = cpu_lduw_code(env, ctx.pc);
         ctx.pc += decode_opc(cpu, &ctx);
         num_insns++;
diff --git a/target-openrisc/translate.c b/target-openrisc/translate.c
index 727fbba..4f9b768 100644
--- a/target-openrisc/translate.c
+++ b/target-openrisc/translate.c
@@ -1687,10 +1687,7 @@ static inline void gen_intermediate_code_internal(OpenRISCCPU *cpu,
             tcg_ctx.gen_opc_instr_start[k] = 1;
             tcg_ctx.gen_opc_icount[k] = num_insns;
         }
-
-        if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
-            tcg_gen_insn_start(dc->pc);
-        }
+        tcg_gen_insn_start(dc->pc);
 
         if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO)) {
             gen_io_start();
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index c46133d..6ca3e9f 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -11502,6 +11502,8 @@ static inline void gen_intermediate_code_internal(PowerPCCPU *cpu,
             tcg_ctx.gen_opc_instr_start[lj] = 1;
             tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
+        tcg_gen_insn_start(ctx.nip);
+
         LOG_DISAS("----------------\n");
         LOG_DISAS("nip=" TARGET_FMT_lx " super=%d ir=%d\n",
                   ctx.nip, ctx.mem_idx, (int)msr_ir);
@@ -11515,9 +11517,6 @@ static inline void gen_intermediate_code_internal(PowerPCCPU *cpu,
         LOG_DISAS("translate opcode %08x (%02x %02x %02x) (%s)\n",
                     ctx.opcode, opc1(ctx.opcode), opc2(ctx.opcode),
                     opc3(ctx.opcode), ctx.le_mode ? "little" : "big");
-        if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
-            tcg_gen_insn_start(ctx.nip);
-        }
         ctx.nip += 4;
         table = env->opcodes;
         num_insns++;
diff --git a/target-s390x/translate.c b/target-s390x/translate.c
index a87d83c..2767f6a 100644
--- a/target-s390x/translate.c
+++ b/target-s390x/translate.c
@@ -5370,14 +5370,12 @@ static inline void gen_intermediate_code_internal(S390CPU *cpu,
             tcg_ctx.gen_opc_instr_start[lj] = 1;
             tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
+        tcg_gen_insn_start(dc.pc);
+
         if (++num_insns == max_insns && (tb->cflags & CF_LAST_IO)) {
             gen_io_start();
         }
 
-        if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
-            tcg_gen_insn_start(dc.pc);
-        }
-
         status = NO_EXIT;
         if (unlikely(!QTAILQ_EMPTY(&cs->breakpoints))) {
             QTAILQ_FOREACH(bp, &cs->breakpoints, entry) {
diff --git a/target-sh4/translate.c b/target-sh4/translate.c
index d9d2c02..1e43e6d 100644
--- a/target-sh4/translate.c
+++ b/target-sh4/translate.c
@@ -1790,10 +1790,6 @@ static void decode_opc(DisasContext * ctx)
 {
     uint32_t old_flags = ctx->flags;
 
-    if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
-        tcg_gen_insn_start(ctx->pc);
-    }
-
     _decode_opc(ctx);
 
     if (old_flags & (DELAY_SLOT | DELAY_SLOT_CONDITIONAL)) {
@@ -1876,12 +1872,12 @@ gen_intermediate_code_internal(SuperHCPU *cpu, TranslationBlock *tb,
             tcg_ctx.gen_opc_instr_start[ii] = 1;
             tcg_ctx.gen_opc_icount[ii] = num_insns;
         }
-        if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO))
+        tcg_gen_insn_start(ctx.pc);
+
+        if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO)) {
             gen_io_start();
-#if 0
-	fprintf(stderr, "Loading opcode at address 0x%08x\n", ctx.pc);
-	fflush(stderr);
-#endif
+        }
+
         ctx.opcode = cpu_lduw_code(env, ctx.pc);
 	decode_opc(&ctx);
         num_insns++;
diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index ef17e26..a47e65f 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -2482,10 +2482,6 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
     TCGv_i64 cpu_src1_64, cpu_src2_64, cpu_dst_64;
     target_long simm;
 
-    if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
-        tcg_gen_insn_start(dc->pc);
-    }
-
     opc = GET_FIELD(insn, 0, 1);
     rd = GET_FIELD(insn, 2, 6);
 
@@ -5271,8 +5267,12 @@ static inline void gen_intermediate_code_internal(SPARCCPU *cpu,
                 tcg_ctx.gen_opc_icount[lj] = num_insns;
             }
         }
-        if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO))
+        tcg_gen_insn_start(dc->pc);
+
+        if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO)) {
             gen_io_start();
+        }
+
         last_pc = dc->pc;
         insn = cpu_ldl_code(env, dc->pc);
 
diff --git a/target-tilegx/translate.c b/target-tilegx/translate.c
index 3fb7fc6..6babc3c 100644
--- a/target-tilegx/translate.c
+++ b/target-tilegx/translate.c
@@ -2008,10 +2008,6 @@ static void translate_one_bundle(DisasContext *dc, uint64_t bundle)
     }
     dc->num_wb = 0;
 
-    if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
-        tcg_gen_insn_start(dc->pc);
-    }
-
     qemu_log_mask(CPU_LOG_TB_IN_ASM, "  %" PRIx64 ":  { ", dc->pc);
     if (get_Mode(bundle)) {
         notice_excp(dc, bundle, "y0", decode_y0(dc, bundle));
@@ -2100,6 +2096,8 @@ static inline void gen_intermediate_code_internal(TileGXCPU *cpu,
             tcg_ctx.gen_opc_instr_start[lj] = 1;
             tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
+        tcg_gen_insn_start(dc->pc);
+
         translate_one_bundle(dc, cpu_ldq_data(env, dc->pc));
 
         if (dc->exit_tb) {
diff --git a/target-tricore/translate.c b/target-tricore/translate.c
index 440f30a..27564d3 100644
--- a/target-tricore/translate.c
+++ b/target-tricore/translate.c
@@ -8292,6 +8292,8 @@ gen_intermediate_code_internal(TriCoreCPU *cpu, struct TranslationBlock *tb,
     tcg_clear_temp_count();
     gen_tb_start(tb);
     while (ctx.bstate == BS_NONE) {
+        tcg_gen_insn_start(ctx.pc);
+
         ctx.opcode = cpu_ldl_code(env, ctx.pc);
         decode_opc(env, &ctx, 0);
 
diff --git a/target-unicore32/translate.c b/target-unicore32/translate.c
index 63a5192..28db34a 100644
--- a/target-unicore32/translate.c
+++ b/target-unicore32/translate.c
@@ -1794,10 +1794,6 @@ static void disas_uc32_insn(CPUUniCore32State *env, DisasContext *s)
     UniCore32CPU *cpu = uc32_env_get_cpu(env);
     unsigned int insn;
 
-    if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
-        tcg_gen_insn_start(s->pc);
-    }
-
     insn = cpu_ldl_code(env, s->pc);
     s->pc += 4;
 
@@ -1941,6 +1937,7 @@ static inline void gen_intermediate_code_internal(UniCore32CPU *cpu,
             tcg_ctx.gen_opc_instr_start[lj] = 1;
             tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
+        tcg_gen_insn_start(dc->pc);
 
         if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO)) {
             gen_io_start();
diff --git a/target-xtensa/translate.c b/target-xtensa/translate.c
index ea777da..ab9e8f9 100644
--- a/target-xtensa/translate.c
+++ b/target-xtensa/translate.c
@@ -3076,10 +3076,7 @@ void gen_intermediate_code_internal(XtensaCPU *cpu,
             tcg_ctx.gen_opc_instr_start[lj] = 1;
             tcg_ctx.gen_opc_icount[lj] = insn_count;
         }
-
-        if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP | CPU_LOG_TB_OP_OPT))) {
-            tcg_gen_insn_start(dc.pc);
-        }
+        tcg_gen_insn_start(dc.pc);
 
         ++dc.ccount_delta;
 
-- 
2.4.3


* [Qemu-devel] [PATCH v3 03/25] target-*: Increment num_insns immediately after tcg_gen_insn_start
  2015-09-22 20:24 [Qemu-devel] [PATCH v3 00/25] Do away with TB retranslation Richard Henderson
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 01/25] tcg: Rename debug_insn_start to insn_start Richard Henderson
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 02/25] target-*: Unconditionally emit tcg_gen_insn_start Richard Henderson
@ 2015-09-22 20:24 ` Richard Henderson
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 04/25] target-*: Introduce and use cpu_breakpoint_test Richard Henderson
                   ` (21 subsequent siblings)
  24 siblings, 0 replies; 53+ messages in thread
From: Richard Henderson @ 2015-09-22 20:24 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, alex.bennee, aurelien

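Incrementing num_insns immediately after tcg_gen_insn_start tidies the
icount test common to all targets: the check becomes
num_insns == max_insns rather than num_insns + 1 == max_insns.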
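
As an illustration only, the reordering can be modelled outside of QEMU.
The sketch below is not code from this series: the function names are
hypothetical, and the loop simply runs max_insns times as a stand-in for
a translation loop. It shows that the "last instruction" test fires on
the same iteration whether the counter is incremented after decoding
(compare against num_insns + 1) or right at the top (compare against
num_insns). It also shows why code that inspects the counter after the
increment point, such as the arm singlestep assert, now sees a value one
higher on the first instruction.

```c
#include <assert.h>

/* Old ordering: test num_insns + 1, increment after decoding.
 * Returns the 0-based iteration on which the test fires, or -1. */
int last_insn_iter_old(int max_insns)
{
    int num_insns = 0;
    int fired = -1;

    for (int iter = 0; iter < max_insns; iter++) {
        if (num_insns + 1 == max_insns) {
            fired = iter;           /* where gen_io_start() would be emitted */
        }
        /* ... one guest instruction would be decoded here ... */
        num_insns++;
    }
    return fired;
}

/* New ordering: increment first, then test num_insns directly. */
int last_insn_iter_new(int max_insns)
{
    int num_insns = 0;
    int fired = -1;

    for (int iter = 0; iter < max_insns; iter++) {
        num_insns++;                /* right after the insn_start marker */
        if (num_insns == max_insns) {
            fired = iter;
        }
        /* ... one guest instruction would be decoded here ... */
    }
    return fired;
}

/* Both orderings must select the final iteration for every limit. */
void check_icount_equivalence(int limit)
{
    for (int max_insns = 1; max_insns <= limit; max_insns++) {
        assert(last_insn_iter_old(max_insns) == last_insn_iter_new(max_insns));
        assert(last_insn_iter_new(max_insns) == max_insns - 1);
    }
}
```

Calling check_icount_equivalence(64) from a test harness exercises both
orderings for every limit from 1 to 64.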

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target-alpha/translate.c      | 4 ++--
 target-arm/translate-a64.c    | 6 +++---
 target-arm/translate.c        | 7 ++++---
 target-cris/translate.c       | 4 ++--
 target-i386/translate.c       | 5 +++--
 target-lm32/translate.c       | 5 ++---
 target-m68k/translate.c       | 4 ++--
 target-microblaze/translate.c | 5 +++--
 target-mips/translate.c       | 5 ++---
 target-moxie/translate.c      | 2 +-
 target-openrisc/translate.c   | 4 ++--
 target-ppc/translate.c        | 4 ++--
 target-s390x/translate.c      | 3 ++-
 target-sh4/translate.c        | 4 ++--
 target-sparc/translate.c      | 4 ++--
 target-tilegx/translate.c     | 3 ++-
 target-tricore/translate.c    | 3 +--
 target-unicore32/translate.c  | 4 ++--
 target-xtensa/translate.c     | 4 ++--
 19 files changed, 41 insertions(+), 39 deletions(-)

diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index 60370d6..fa0ac2d 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -2934,12 +2934,12 @@ static inline void gen_intermediate_code_internal(AlphaCPU *cpu,
             tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
         tcg_gen_insn_start(ctx.pc);
+        num_insns++;
 
-        if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO)) {
+        if (num_insns == max_insns && (tb->cflags & CF_LAST_IO)) {
             gen_io_start();
         }
         insn = cpu_ldl_code(env, ctx.pc);
-        num_insns++;
 
         TCGV_UNUSED_I64(ctx.zero);
         TCGV_UNUSED_I64(ctx.sink);
diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index 6a66ac0..4670941 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -11104,8 +11104,9 @@ void gen_intermediate_code_internal_a64(ARMCPU *cpu,
             tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
         tcg_gen_insn_start(dc->pc);
+        num_insns++;
 
-        if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO)) {
+        if (num_insns == max_insns && (tb->cflags & CF_LAST_IO)) {
             gen_io_start();
         }
 
@@ -11120,7 +11121,7 @@ void gen_intermediate_code_internal_a64(ARMCPU *cpu,
              * "did not step an insn" case, and so the syndrome ISV and EX
              * bits should be zero.
              */
-            assert(num_insns == 0);
+            assert(num_insns == 1);
             gen_exception(EXCP_UDEF, syn_swstep(dc->ss_same_el, 0, 0),
                           default_exception_el(dc));
             dc->is_jmp = DISAS_EXC;
@@ -11139,7 +11140,6 @@ void gen_intermediate_code_internal_a64(ARMCPU *cpu,
          * Also stop translation when a page boundary is reached.  This
          * ensures prefetch aborts occur at the right place.
          */
-        num_insns++;
     } while (!dc->is_jmp && !tcg_op_buf_full() &&
              !cs->singlestep_enabled &&
              !singlestep &&
diff --git a/target-arm/translate.c b/target-arm/translate.c
index 8348848..cd88997 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -11349,9 +11349,11 @@ static inline void gen_intermediate_code_internal(ARMCPU *cpu,
             tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
         tcg_gen_insn_start(dc->pc);
+        num_insns++;
 
-        if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO))
+        if (num_insns == max_insns && (tb->cflags & CF_LAST_IO)) {
             gen_io_start();
+        }
 
         if (dc->ss_active && !dc->pstate_ss) {
             /* Singlestep state is Active-pending.
@@ -11364,7 +11366,7 @@ static inline void gen_intermediate_code_internal(ARMCPU *cpu,
              * "did not step an insn" case, and so the syndrome ISV and EX
              * bits should be zero.
              */
-            assert(num_insns == 0);
+            assert(num_insns == 1);
             gen_exception(EXCP_UDEF, syn_swstep(dc->ss_same_el, 0, 0),
                           default_exception_el(dc));
             goto done_generating;
@@ -11400,7 +11402,6 @@ static inline void gen_intermediate_code_internal(ARMCPU *cpu,
          * Otherwise the subsequent code could get translated several times.
          * Also stop translation when a page boundary is reached.  This
          * ensures prefetch aborts occur at the right place.  */
-        num_insns ++;
     } while (!dc->is_jmp && !tcg_op_buf_full() &&
              !cs->singlestep_enabled &&
              !singlestep &&
diff --git a/target-cris/translate.c b/target-cris/translate.c
index 0a4b363..bba7217 100644
--- a/target-cris/translate.c
+++ b/target-cris/translate.c
@@ -3194,11 +3194,12 @@ gen_intermediate_code_internal(CRISCPU *cpu, TranslationBlock *tb,
             tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
         tcg_gen_insn_start(dc->pc);
+        num_insns++;
 
         /* Pretty disas.  */
         LOG_DIS("%8.8x:\t", dc->pc);
 
-        if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO)) {
+        if (num_insns == max_insns && (tb->cflags & CF_LAST_IO)) {
             gen_io_start();
         }
         dc->clear_x = 1;
@@ -3210,7 +3211,6 @@ gen_intermediate_code_internal(CRISCPU *cpu, TranslationBlock *tb,
             cris_clear_x_flag(dc);
         }
 
-        num_insns++;
         /* Check for delayed branches here. If we do it before
            actually generating any host code, the simulator will just
            loop doing nothing for on this program location.  */
diff --git a/target-i386/translate.c b/target-i386/translate.c
index 82d32e1..3d0c23d 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -7960,12 +7960,13 @@ static inline void gen_intermediate_code_internal(X86CPU *cpu,
             tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
         tcg_gen_insn_start(pc_ptr);
+        num_insns++;
 
-        if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO))
+        if (num_insns == max_insns && (tb->cflags & CF_LAST_IO)) {
             gen_io_start();
+        }
 
         pc_ptr = disas_insn(env, dc, pc_ptr);
-        num_insns++;
         /* stop translation if indicated */
         if (dc->is_jmp)
             break;
diff --git a/target-lm32/translate.c b/target-lm32/translate.c
index 84eeac3..a34914a 100644
--- a/target-lm32/translate.c
+++ b/target-lm32/translate.c
@@ -1103,18 +1103,17 @@ void gen_intermediate_code_internal(LM32CPU *cpu,
             tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
         tcg_gen_insn_start(dc->pc);
+        num_insns++;
 
         /* Pretty disas.  */
         LOG_DIS("%8.8x:\t", dc->pc);
 
-        if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO)) {
+        if (num_insns == max_insns && (tb->cflags & CF_LAST_IO)) {
             gen_io_start();
         }
 
         decode(dc, cpu_ldl_code(env, dc->pc));
         dc->pc += 4;
-        num_insns++;
-
     } while (!dc->is_jmp
          && !tcg_op_buf_full()
          && !cs->singlestep_enabled
diff --git a/target-m68k/translate.c b/target-m68k/translate.c
index bfd9c00..422244e 100644
--- a/target-m68k/translate.c
+++ b/target-m68k/translate.c
@@ -3022,14 +3022,14 @@ gen_intermediate_code_internal(M68kCPU *cpu, TranslationBlock *tb,
             tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
         tcg_gen_insn_start(dc->pc);
+        num_insns++;
 
-        if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO)) {
+        if (num_insns == max_insns && (tb->cflags & CF_LAST_IO)) {
             gen_io_start();
         }
 
         dc->insn_pc = dc->pc;
 	disas_m68k_insn(env, dc);
-        num_insns++;
     } while (!dc->is_jmp && !tcg_op_buf_full() &&
              !cs->singlestep_enabled &&
              !singlestep &&
diff --git a/target-microblaze/translate.c b/target-microblaze/translate.c
index 02ccf45..a25b042 100644
--- a/target-microblaze/translate.c
+++ b/target-microblaze/translate.c
@@ -1715,19 +1715,20 @@ gen_intermediate_code_internal(MicroBlazeCPU *cpu, TranslationBlock *tb,
                         tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
         tcg_gen_insn_start(dc->pc);
+        num_insns++;
 
         /* Pretty disas.  */
         LOG_DIS("%8.8x:\t", dc->pc);
 
-        if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO))
+        if (num_insns == max_insns && (tb->cflags & CF_LAST_IO)) {
             gen_io_start();
+        }
 
         dc->clear_imm = 1;
         decode(dc, cpu_ldl_code(env, dc->pc));
         if (dc->clear_imm)
             dc->tb_flags &= ~IMM_FLAG;
         dc->pc += 4;
-        num_insns++;
 
         if (dc->delayed_branch) {
             dc->delayed_branch--;
diff --git a/target-mips/translate.c b/target-mips/translate.c
index aa0e0fd..66147d8 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -19619,8 +19619,9 @@ gen_intermediate_code_internal(MIPSCPU *cpu, TranslationBlock *tb,
             tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
         tcg_gen_insn_start(ctx.pc);
+        num_insns++;
 
-        if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO)) {
+        if (num_insns == max_insns && (tb->cflags & CF_LAST_IO)) {
             gen_io_start();
         }
 
@@ -19659,8 +19660,6 @@ gen_intermediate_code_internal(MIPSCPU *cpu, TranslationBlock *tb,
         }
         ctx.pc += insn_bytes;
 
-        num_insns++;
-
         /* Execute a branch and its delay slot as a single instruction.
            This is what GDB expects and is consistent with what the
            hardware does (e.g. if a delay slot instruction faults, the
diff --git a/target-moxie/translate.c b/target-moxie/translate.c
index 1becfde..f71ed24 100644
--- a/target-moxie/translate.c
+++ b/target-moxie/translate.c
@@ -862,10 +862,10 @@ gen_intermediate_code_internal(MoxieCPU *cpu, TranslationBlock *tb,
             tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
         tcg_gen_insn_start(ctx.pc);
+        num_insns++;
 
         ctx.opcode = cpu_lduw_code(env, ctx.pc);
         ctx.pc += decode_opc(cpu, &ctx);
-        num_insns++;
 
         if (cs->singlestep_enabled) {
             break;
diff --git a/target-openrisc/translate.c b/target-openrisc/translate.c
index 4f9b768..f9b4ed5 100644
--- a/target-openrisc/translate.c
+++ b/target-openrisc/translate.c
@@ -1688,8 +1688,9 @@ static inline void gen_intermediate_code_internal(OpenRISCCPU *cpu,
             tcg_ctx.gen_opc_icount[k] = num_insns;
         }
         tcg_gen_insn_start(dc->pc);
+        num_insns++;
 
-        if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO)) {
+        if (num_insns == max_insns && (tb->cflags & CF_LAST_IO)) {
             gen_io_start();
         }
         dc->ppc = dc->pc - 4;
@@ -1698,7 +1699,6 @@ static inline void gen_intermediate_code_internal(OpenRISCCPU *cpu,
         tcg_gen_movi_tl(cpu_npc, dc->npc);
         disas_openrisc_insn(dc, cpu);
         dc->pc = dc->npc;
-        num_insns++;
         /* delay slot */
         if (dc->delayed_branch) {
             dc->delayed_branch--;
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 6ca3e9f..7c288aa 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -11503,11 +11503,12 @@ static inline void gen_intermediate_code_internal(PowerPCCPU *cpu,
             tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
         tcg_gen_insn_start(ctx.nip);
+        num_insns++;
 
         LOG_DISAS("----------------\n");
         LOG_DISAS("nip=" TARGET_FMT_lx " super=%d ir=%d\n",
                   ctx.nip, ctx.mem_idx, (int)msr_ir);
-        if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO))
+        if (num_insns == max_insns && (tb->cflags & CF_LAST_IO))
             gen_io_start();
         if (unlikely(need_byteswap(&ctx))) {
             ctx.opcode = bswap32(cpu_ldl_code(env, ctx.nip));
@@ -11519,7 +11520,6 @@ static inline void gen_intermediate_code_internal(PowerPCCPU *cpu,
                     opc3(ctx.opcode), ctx.le_mode ? "little" : "big");
         ctx.nip += 4;
         table = env->opcodes;
-        num_insns++;
         handler = table[opc1(ctx.opcode)];
         if (is_indirect_opcode(handler)) {
             table = ind_table(handler);
diff --git a/target-s390x/translate.c b/target-s390x/translate.c
index 2767f6a..58cf365 100644
--- a/target-s390x/translate.c
+++ b/target-s390x/translate.c
@@ -5371,8 +5371,9 @@ static inline void gen_intermediate_code_internal(S390CPU *cpu,
             tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
         tcg_gen_insn_start(dc.pc);
+        num_insns++;
 
-        if (++num_insns == max_insns && (tb->cflags & CF_LAST_IO)) {
+        if (num_insns == max_insns && (tb->cflags & CF_LAST_IO)) {
             gen_io_start();
         }
 
diff --git a/target-sh4/translate.c b/target-sh4/translate.c
index 1e43e6d..e0294e7 100644
--- a/target-sh4/translate.c
+++ b/target-sh4/translate.c
@@ -1873,14 +1873,14 @@ gen_intermediate_code_internal(SuperHCPU *cpu, TranslationBlock *tb,
             tcg_ctx.gen_opc_icount[ii] = num_insns;
         }
         tcg_gen_insn_start(ctx.pc);
+        num_insns++;
 
-        if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO)) {
+        if (num_insns == max_insns && (tb->cflags & CF_LAST_IO)) {
             gen_io_start();
         }
 
         ctx.opcode = cpu_lduw_code(env, ctx.pc);
 	decode_opc(&ctx);
-        num_insns++;
 	ctx.pc += 2;
 	if ((ctx.pc & (TARGET_PAGE_SIZE - 1)) == 0)
 	    break;
diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index a47e65f..762eb9b 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -5268,8 +5268,9 @@ static inline void gen_intermediate_code_internal(SPARCCPU *cpu,
             }
         }
         tcg_gen_insn_start(dc->pc);
+        num_insns++;
 
-        if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO)) {
+        if (num_insns == max_insns && (tb->cflags & CF_LAST_IO)) {
             gen_io_start();
         }
 
@@ -5277,7 +5278,6 @@ static inline void gen_intermediate_code_internal(SPARCCPU *cpu,
         insn = cpu_ldl_code(env, dc->pc);
 
         disas_sparc_insn(dc, insn);
-        num_insns++;
 
         if (dc->is_br)
             break;
diff --git a/target-tilegx/translate.c b/target-tilegx/translate.c
index 6babc3c..c23b761 100644
--- a/target-tilegx/translate.c
+++ b/target-tilegx/translate.c
@@ -2097,6 +2097,7 @@ static inline void gen_intermediate_code_internal(TileGXCPU *cpu,
             tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
         tcg_gen_insn_start(dc->pc);
+        num_insns++;
 
         translate_one_bundle(dc, cpu_ldq_data(env, dc->pc));
 
@@ -2105,7 +2106,7 @@ static inline void gen_intermediate_code_internal(TileGXCPU *cpu,
             break;
         }
         dc->pc += TILEGX_BUNDLE_SIZE_IN_BYTES;
-        if (++num_insns >= max_insns
+        if (num_insns >= max_insns
             || dc->pc >= next_page_start
             || tcg_op_buf_full()) {
             /* Ending the TB due to TB size or page boundary.  Set PC.  */
diff --git a/target-tricore/translate.c b/target-tricore/translate.c
index 27564d3..fa10d5c 100644
--- a/target-tricore/translate.c
+++ b/target-tricore/translate.c
@@ -8293,12 +8293,11 @@ gen_intermediate_code_internal(TriCoreCPU *cpu, struct TranslationBlock *tb,
     gen_tb_start(tb);
     while (ctx.bstate == BS_NONE) {
         tcg_gen_insn_start(ctx.pc);
+        num_insns++;
 
         ctx.opcode = cpu_ldl_code(env, ctx.pc);
         decode_opc(env, &ctx, 0);
 
-        num_insns++;
-
         if (tcg_op_buf_full()) {
             gen_save_pc(ctx.next_pc);
             tcg_gen_exit_tb(0);
diff --git a/target-unicore32/translate.c b/target-unicore32/translate.c
index 28db34a..7aad61f 100644
--- a/target-unicore32/translate.c
+++ b/target-unicore32/translate.c
@@ -1938,8 +1938,9 @@ static inline void gen_intermediate_code_internal(UniCore32CPU *cpu,
             tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
         tcg_gen_insn_start(dc->pc);
+        num_insns++;
 
-        if (num_insns + 1 == max_insns && (tb->cflags & CF_LAST_IO)) {
+        if (num_insns == max_insns && (tb->cflags & CF_LAST_IO)) {
             gen_io_start();
         }
 
@@ -1958,7 +1959,6 @@ static inline void gen_intermediate_code_internal(UniCore32CPU *cpu,
          * Otherwise the subsequent code could get translated several times.
          * Also stop translation when a page boundary is reached.  This
          * ensures prefetch aborts occur at the right place.  */
-        num_insns++;
     } while (!dc->is_jmp && !tcg_op_buf_full() &&
              !cs->singlestep_enabled &&
              !singlestep &&
diff --git a/target-xtensa/translate.c b/target-xtensa/translate.c
index ab9e8f9..3607e41 100644
--- a/target-xtensa/translate.c
+++ b/target-xtensa/translate.c
@@ -3077,10 +3077,11 @@ void gen_intermediate_code_internal(XtensaCPU *cpu,
             tcg_ctx.gen_opc_icount[lj] = insn_count;
         }
         tcg_gen_insn_start(dc.pc);
+        ++insn_count;
 
         ++dc.ccount_delta;
 
-        if (insn_count + 1 == max_insns && (tb->cflags & CF_LAST_IO)) {
+        if (insn_count == max_insns && (tb->cflags & CF_LAST_IO)) {
             gen_io_start();
         }
 
@@ -3101,7 +3102,6 @@ void gen_intermediate_code_internal(XtensaCPU *cpu,
         }
 
         disas_xtensa_insn(env, &dc);
-        ++insn_count;
         if (dc.icount) {
             tcg_gen_mov_i32(cpu_SR[ICOUNT], dc.next_icount);
         }
-- 
2.4.3


* [Qemu-devel] [PATCH v3 04/25] target-*: Introduce and use cpu_breakpoint_test
  2015-09-22 20:24 [Qemu-devel] [PATCH v3 00/25] Do away with TB retranslation Richard Henderson
                   ` (2 preceding siblings ...)
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 03/25] target-*: Increment num_insns immediately after tcg_gen_insn_start Richard Henderson
@ 2015-09-22 20:24 ` Richard Henderson
  2015-09-23 19:19   ` Peter Maydell
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 05/25] tcg: Allow extra data to be attached to insn_start Richard Henderson
                   ` (20 subsequent siblings)
  24 siblings, 1 reply; 53+ messages in thread
From: Richard Henderson @ 2015-09-22 20:24 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, alex.bennee, aurelien

Reduce the boilerplate required for each target.  At the same time,
move the breakpoint test so that it runs after tcg_gen_insn_start.

Note that arm and aarch64 do not use cpu_breakpoint_test, but still
move the inline test down after tcg_gen_insn_start.
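
For readers without the QEMU tree at hand, the matching rule that
cpu_breakpoint_test applies can be sketched in isolation. This is a
simplified model, not the helper itself: the Breakpoint struct and
breakpoint_test function are hypothetical stand-ins that use a plain
singly linked list in place of CPUState and the QTAILQ macros. Only the
BP_GDB, BP_CPU and BP_ANY values are taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BP_GDB 0x10
#define BP_CPU 0x20
#define BP_ANY (BP_GDB | BP_CPU)

/* Hypothetical stand-in for CPUBreakpoint; not the QEMU type. */
typedef struct Breakpoint {
    uint64_t pc;                /* guest address of the breakpoint */
    int flags;                  /* BP_GDB and/or BP_CPU */
    struct Breakpoint *next;    /* next list entry, or NULL */
} Breakpoint;

/* Return true if pc matches a breakpoint whose flags intersect mask. */
bool breakpoint_test(const Breakpoint *head, uint64_t pc, int mask)
{
    for (const Breakpoint *bp = head; bp != NULL; bp = bp->next) {
        if (bp->pc == pc && (bp->flags & mask)) {
            return true;
        }
    }
    return false;
}
```

With BP_ANY as the mask, either kind of breakpoint at the address
matches, which is how the converted targets below call the helper. A
narrower mask such as BP_GDB ignores breakpoints carrying only BP_CPU.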

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 include/qom/cpu.h             | 16 ++++++++++++++++
 target-alpha/translate.c      | 13 ++++---------
 target-arm/translate-a64.c    | 26 +++++++++++++-------------
 target-arm/translate.c        | 31 ++++++++++++++++---------------
 target-cris/translate.c       | 27 ++++++++-------------------
 target-i386/translate.c       | 17 +++++++----------
 target-lm32/translate.c       | 25 +++++++------------------
 target-m68k/translate.c       | 18 ++++++------------
 target-microblaze/translate.c | 36 +++++++++++++-----------------------
 target-mips/translate.c       | 25 ++++++++++---------------
 target-moxie/translate.c      | 19 +++++++------------
 target-openrisc/translate.c   | 24 +++++++-----------------
 target-ppc/translate.c        | 14 +++++---------
 target-s390x/translate.c      | 16 ++++++----------
 target-sh4/translate.c        | 20 ++++++++------------
 target-sparc/translate.c      | 23 ++++++++++-------------
 target-unicore32/translate.c  | 24 ++++++++++--------------
 target-xtensa/translate.c     | 25 +++++++------------------
 18 files changed, 160 insertions(+), 239 deletions(-)

diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 302673d..e11dca3 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -679,6 +679,7 @@ void cpu_single_step(CPUState *cpu, int enabled);
 /* 0x08 currently unused */
 #define BP_GDB                0x10
 #define BP_CPU                0x20
+#define BP_ANY                (BP_GDB | BP_CPU)
 #define BP_WATCHPOINT_HIT_READ 0x40
 #define BP_WATCHPOINT_HIT_WRITE 0x80
 #define BP_WATCHPOINT_HIT (BP_WATCHPOINT_HIT_READ | BP_WATCHPOINT_HIT_WRITE)
@@ -689,6 +690,21 @@ int cpu_breakpoint_remove(CPUState *cpu, vaddr pc, int flags);
 void cpu_breakpoint_remove_by_ref(CPUState *cpu, CPUBreakpoint *breakpoint);
 void cpu_breakpoint_remove_all(CPUState *cpu, int mask);
 
+/* Return true if PC matches an installed breakpoint.  */
+static inline bool cpu_breakpoint_test(CPUState *cpu, vaddr pc, int mask)
+{
+    CPUBreakpoint *bp;
+
+    if (unlikely(!QTAILQ_EMPTY(&cpu->breakpoints))) {
+        QTAILQ_FOREACH(bp, &cpu->breakpoints, entry) {
+            if (bp->pc == pc && (bp->flags & mask)) {
+                return true;
+            }
+        }
+    }
+    return false;
+}
+
 int cpu_watchpoint_insert(CPUState *cpu, vaddr addr, vaddr len,
                           int flags, CPUWatchpoint **watchpoint);
 int cpu_watchpoint_remove(CPUState *cpu, vaddr addr,
diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index fa0ac2d..c10193e 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -2868,7 +2868,6 @@ static inline void gen_intermediate_code_internal(AlphaCPU *cpu,
     target_ulong pc_start;
     target_ulong pc_mask;
     uint32_t insn;
-    CPUBreakpoint *bp;
     int j, lj = -1;
     ExitStatus ret;
     int num_insns;
@@ -2913,14 +2912,6 @@ static inline void gen_intermediate_code_internal(AlphaCPU *cpu,
 
     gen_tb_start(tb);
     do {
-        if (unlikely(!QTAILQ_EMPTY(&cs->breakpoints))) {
-            QTAILQ_FOREACH(bp, &cs->breakpoints, entry) {
-                if (bp->pc == ctx.pc) {
-                    gen_excp(&ctx, EXCP_DEBUG, 0);
-                    break;
-                }
-            }
-        }
         if (search_pc) {
             j = tcg_op_buf_count();
             if (lj < j) {
@@ -2936,6 +2927,10 @@ static inline void gen_intermediate_code_internal(AlphaCPU *cpu,
         tcg_gen_insn_start(ctx.pc);
         num_insns++;
 
+        if (unlikely(cpu_breakpoint_test(cs, ctx.pc, BP_ANY))) {
+            gen_excp(&ctx, EXCP_DEBUG, 0);
+            break;
+        }
         if (num_insns == max_insns && (tb->cflags & CF_LAST_IO)) {
             gen_io_start();
         }
diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index 4670941..bc2040e 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -11007,7 +11007,6 @@ void gen_intermediate_code_internal_a64(ARMCPU *cpu,
     CPUState *cs = CPU(cpu);
     CPUARMState *env = &cpu->env;
     DisasContext dc1, *dc = &dc1;
-    CPUBreakpoint *bp;
     int j, lj;
     target_ulong pc_start;
     target_ulong next_page_start;
@@ -11079,18 +11078,6 @@ void gen_intermediate_code_internal_a64(ARMCPU *cpu,
     tcg_clear_temp_count();
 
     do {
-        if (unlikely(!QTAILQ_EMPTY(&cs->breakpoints))) {
-            QTAILQ_FOREACH(bp, &cs->breakpoints, entry) {
-                if (bp->pc == dc->pc) {
-                    gen_exception_internal_insn(dc, 0, EXCP_DEBUG);
-                    /* Advance PC so that clearing the breakpoint will
-                       invalidate this TB.  */
-                    dc->pc += 2;
-                    goto done_generating;
-                }
-            }
-        }
-
         if (search_pc) {
             j = tcg_op_buf_count();
             if (lj < j) {
@@ -11106,6 +11093,19 @@ void gen_intermediate_code_internal_a64(ARMCPU *cpu,
         tcg_gen_insn_start(dc->pc);
         num_insns++;
 
+        if (unlikely(!QTAILQ_EMPTY(&cs->breakpoints))) {
+            CPUBreakpoint *bp;
+            QTAILQ_FOREACH(bp, &cs->breakpoints, entry) {
+                if (bp->pc == dc->pc) {
+                    gen_exception_internal_insn(dc, 0, EXCP_DEBUG);
+                    /* Advance PC so that clearing the breakpoint will
+                       invalidate this TB.  */
+                    dc->pc += 2;
+                    goto done_generating;
+                }
+            }
+        }
+
         if (num_insns == max_insns && (tb->cflags & CF_LAST_IO)) {
             gen_io_start();
         }
diff --git a/target-arm/translate.c b/target-arm/translate.c
index cd88997..44468dc 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -11177,7 +11177,6 @@ static inline void gen_intermediate_code_internal(ARMCPU *cpu,
     CPUState *cs = CPU(cpu);
     CPUARMState *env = &cpu->env;
     DisasContext dc1, *dc = &dc1;
-    CPUBreakpoint *bp;
     int j, lj;
     target_ulong pc_start;
     target_ulong next_page_start;
@@ -11306,6 +11305,21 @@ static inline void gen_intermediate_code_internal(ARMCPU *cpu,
         store_cpu_field(tmp, condexec_bits);
       }
     do {
+        if (search_pc) {
+            j = tcg_op_buf_count();
+            if (lj < j) {
+                lj++;
+                while (lj < j)
+                    tcg_ctx.gen_opc_instr_start[lj++] = 0;
+            }
+            tcg_ctx.gen_opc_pc[lj] = dc->pc;
+            gen_opc_condexec_bits[lj] = (dc->condexec_cond << 4) | (dc->condexec_mask >> 1);
+            tcg_ctx.gen_opc_instr_start[lj] = 1;
+            tcg_ctx.gen_opc_icount[lj] = num_insns;
+        }
+        tcg_gen_insn_start(dc->pc);
+        num_insns++;
+
 #ifdef CONFIG_USER_ONLY
         /* Intercept jump to the magic kernel page.  */
         if (dc->pc >= 0xffff0000) {
@@ -11326,6 +11340,7 @@ static inline void gen_intermediate_code_internal(ARMCPU *cpu,
 #endif
 
         if (unlikely(!QTAILQ_EMPTY(&cs->breakpoints))) {
+            CPUBreakpoint *bp;
             QTAILQ_FOREACH(bp, &cs->breakpoints, entry) {
                 if (bp->pc == dc->pc) {
                     gen_exception_internal_insn(dc, 0, EXCP_DEBUG);
@@ -11336,20 +11351,6 @@ static inline void gen_intermediate_code_internal(ARMCPU *cpu,
                 }
             }
         }
-        if (search_pc) {
-            j = tcg_op_buf_count();
-            if (lj < j) {
-                lj++;
-                while (lj < j)
-                    tcg_ctx.gen_opc_instr_start[lj++] = 0;
-            }
-            tcg_ctx.gen_opc_pc[lj] = dc->pc;
-            gen_opc_condexec_bits[lj] = (dc->condexec_cond << 4) | (dc->condexec_mask >> 1);
-            tcg_ctx.gen_opc_instr_start[lj] = 1;
-            tcg_ctx.gen_opc_icount[lj] = num_insns;
-        }
-        tcg_gen_insn_start(dc->pc);
-        num_insns++;
 
         if (num_insns == max_insns && (tb->cflags & CF_LAST_IO)) {
             gen_io_start();
diff --git a/target-cris/translate.c b/target-cris/translate.c
index bba7217..477bddc 100644
--- a/target-cris/translate.c
+++ b/target-cris/translate.c
@@ -3030,23 +3030,6 @@ static unsigned int crisv32_decoder(CPUCRISState *env, DisasContext *dc)
     return insn_len;
 }
 
-static void check_breakpoint(CPUCRISState *env, DisasContext *dc)
-{
-    CPUState *cs = CPU(cris_env_get_cpu(env));
-    CPUBreakpoint *bp;
-
-    if (unlikely(!QTAILQ_EMPTY(&cs->breakpoints))) {
-        QTAILQ_FOREACH(bp, &cs->breakpoints, entry) {
-            if (bp->pc == dc->pc) {
-                cris_evaluate_flags(dc);
-                tcg_gen_movi_tl(env_pc, dc->pc);
-                t_gen_raise_exception(EXCP_DEBUG);
-                dc->is_jmp = DISAS_UPDATE;
-            }
-        }
-    }
-}
-
 #include "translate_v10.c"
 
 /*
@@ -3175,8 +3158,6 @@ gen_intermediate_code_internal(CRISCPU *cpu, TranslationBlock *tb,
 
     gen_tb_start(tb);
     do {
-        check_breakpoint(env, dc);
-
         if (search_pc) {
             j = tcg_op_buf_count();
             if (lj < j) {
@@ -3196,6 +3177,14 @@ gen_intermediate_code_internal(CRISCPU *cpu, TranslationBlock *tb,
         tcg_gen_insn_start(dc->pc);
         num_insns++;
 
+        if (unlikely(cpu_breakpoint_test(cs, dc->pc, BP_ANY))) {
+            cris_evaluate_flags(dc);
+            tcg_gen_movi_tl(env_pc, dc->pc);
+            t_gen_raise_exception(EXCP_DEBUG);
+            dc->is_jmp = DISAS_UPDATE;
+            break;
+        }
+
         /* Pretty disas.  */
         LOG_DIS("%8.8x:\t", dc->pc);
 
diff --git a/target-i386/translate.c b/target-i386/translate.c
index 3d0c23d..9ec9c4c 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -7849,7 +7849,6 @@ static inline void gen_intermediate_code_internal(X86CPU *cpu,
     CPUX86State *env = &cpu->env;
     DisasContext dc1, *dc = &dc1;
     target_ulong pc_ptr;
-    CPUBreakpoint *bp;
     int j, lj;
     uint64_t flags;
     target_ulong pc_start;
@@ -7938,15 +7937,6 @@ static inline void gen_intermediate_code_internal(X86CPU *cpu,
 
     gen_tb_start(tb);
     for(;;) {
-        if (unlikely(!QTAILQ_EMPTY(&cs->breakpoints))) {
-            QTAILQ_FOREACH(bp, &cs->breakpoints, entry) {
-                if (bp->pc == pc_ptr &&
-                    !((bp->flags & BP_CPU) && (tb->flags & HF_RF_MASK))) {
-                    gen_debug(dc, pc_ptr - dc->cs_base);
-                    goto done_generating;
-                }
-            }
-        }
         if (search_pc) {
             j = tcg_op_buf_count();
             if (lj < j) {
@@ -7962,6 +7952,13 @@ static inline void gen_intermediate_code_internal(X86CPU *cpu,
         tcg_gen_insn_start(pc_ptr);
         num_insns++;
 
+        /* If RF is set, suppress an internally generated breakpoint.  */
+        if (unlikely(cpu_breakpoint_test(cs, pc_ptr,
+                                         tb->flags & HF_RF_MASK
+                                         ? BP_GDB : BP_ANY))) {
+            gen_debug(dc, pc_ptr - dc->cs_base);
+            goto done_generating;
+        }
         if (num_insns == max_insns && (tb->cflags & CF_LAST_IO)) {
             gen_io_start();
         }
diff --git a/target-lm32/translate.c b/target-lm32/translate.c
index a34914a..8ea7929 100644
--- a/target-lm32/translate.c
+++ b/target-lm32/translate.c
@@ -1032,22 +1032,6 @@ static inline void decode(DisasContext *dc, uint32_t ir)
     decinfo[dc->opcode](dc);
 }
 
-static void check_breakpoint(CPULM32State *env, DisasContext *dc)
-{
-    CPUState *cs = CPU(lm32_env_get_cpu(env));
-    CPUBreakpoint *bp;
-
-    if (unlikely(!QTAILQ_EMPTY(&cs->breakpoints))) {
-        QTAILQ_FOREACH(bp, &cs->breakpoints, entry) {
-            if (bp->pc == dc->pc) {
-                tcg_gen_movi_tl(cpu_pc, dc->pc);
-                t_gen_raise_exception(dc, EXCP_DEBUG);
-                dc->is_jmp = DISAS_UPDATE;
-             }
-        }
-    }
-}
-
 /* generate intermediate code for basic block 'tb'.  */
 static inline
 void gen_intermediate_code_internal(LM32CPU *cpu,
@@ -1088,8 +1072,6 @@ void gen_intermediate_code_internal(LM32CPU *cpu,
 
     gen_tb_start(tb);
     do {
-        check_breakpoint(env, dc);
-
         if (search_pc) {
             j = tcg_op_buf_count();
             if (lj < j) {
@@ -1105,6 +1087,13 @@ void gen_intermediate_code_internal(LM32CPU *cpu,
         tcg_gen_insn_start(dc->pc);
         num_insns++;
 
+        if (unlikely(cpu_breakpoint_test(cs, dc->pc, BP_ANY))) {
+            tcg_gen_movi_tl(cpu_pc, dc->pc);
+            t_gen_raise_exception(dc, EXCP_DEBUG);
+            dc->is_jmp = DISAS_UPDATE;
+            break;
+        }
+
         /* Pretty disas.  */
         LOG_DIS("%8.8x:\t", dc->pc);
 
diff --git a/target-m68k/translate.c b/target-m68k/translate.c
index 422244e..afef37f 100644
--- a/target-m68k/translate.c
+++ b/target-m68k/translate.c
@@ -2969,7 +2969,6 @@ gen_intermediate_code_internal(M68kCPU *cpu, TranslationBlock *tb,
     CPUState *cs = CPU(cpu);
     CPUM68KState *env = &cpu->env;
     DisasContext dc1, *dc = &dc1;
-    CPUBreakpoint *bp;
     int j, lj;
     target_ulong pc_start;
     int pc_offset;
@@ -2999,17 +2998,6 @@ gen_intermediate_code_internal(M68kCPU *cpu, TranslationBlock *tb,
     do {
         pc_offset = dc->pc - pc_start;
         gen_throws_exception = NULL;
-        if (unlikely(!QTAILQ_EMPTY(&cs->breakpoints))) {
-            QTAILQ_FOREACH(bp, &cs->breakpoints, entry) {
-                if (bp->pc == dc->pc) {
-                    gen_exception(dc, dc->pc, EXCP_DEBUG);
-                    dc->is_jmp = DISAS_JUMP;
-                    break;
-                }
-            }
-            if (dc->is_jmp)
-                break;
-        }
         if (search_pc) {
             j = tcg_op_buf_count();
             if (lj < j) {
@@ -3024,6 +3012,12 @@ gen_intermediate_code_internal(M68kCPU *cpu, TranslationBlock *tb,
         tcg_gen_insn_start(dc->pc);
         num_insns++;
 
+        if (unlikely(cpu_breakpoint_test(cs, dc->pc, BP_ANY))) {
+            gen_exception(dc, dc->pc, EXCP_DEBUG);
+            dc->is_jmp = DISAS_JUMP;
+            break;
+        }
+
         if (num_insns == max_insns && (tb->cflags & CF_LAST_IO)) {
             gen_io_start();
         }
diff --git a/target-microblaze/translate.c b/target-microblaze/translate.c
index a25b042..1224456 100644
--- a/target-microblaze/translate.c
+++ b/target-microblaze/translate.c
@@ -1626,21 +1626,6 @@ static inline void decode(DisasContext *dc, uint32_t ir)
     }
 }
 
-static void check_breakpoint(CPUMBState *env, DisasContext *dc)
-{
-    CPUState *cs = CPU(mb_env_get_cpu(env));
-    CPUBreakpoint *bp;
-
-    if (unlikely(!QTAILQ_EMPTY(&cs->breakpoints))) {
-        QTAILQ_FOREACH(bp, &cs->breakpoints, entry) {
-            if (bp->pc == dc->pc) {
-                t_gen_raise_exception(dc, EXCP_DEBUG);
-                dc->is_jmp = DISAS_UPDATE;
-             }
-        }
-    }
-}
-
 /* generate intermediate code for basic block 'tb'.  */
 static inline void
 gen_intermediate_code_internal(MicroBlazeCPU *cpu, TranslationBlock *tb,
@@ -1695,14 +1680,6 @@ gen_intermediate_code_internal(MicroBlazeCPU *cpu, TranslationBlock *tb,
     gen_tb_start(tb);
     do
     {
-#if SIM_COMPAT
-        if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) {
-            tcg_gen_movi_tl(cpu_SR[SR_PC], dc->pc);
-            gen_helper_debug();
-        }
-#endif
-        check_breakpoint(env, dc);
-
         if (search_pc) {
             j = tcg_op_buf_count();
             if (lj < j) {
@@ -1717,6 +1694,19 @@ gen_intermediate_code_internal(MicroBlazeCPU *cpu, TranslationBlock *tb,
         tcg_gen_insn_start(dc->pc);
         num_insns++;
 
+#if SIM_COMPAT
+        if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) {
+            tcg_gen_movi_tl(cpu_SR[SR_PC], dc->pc);
+            gen_helper_debug();
+        }
+#endif
+
+        if (unlikely(cpu_breakpoint_test(cs, dc->pc, BP_ANY))) {
+            t_gen_raise_exception(dc, EXCP_DEBUG);
+            dc->is_jmp = DISAS_UPDATE;
+            break;
+        }
+
         /* Pretty disas.  */
         LOG_DIS("%8.8x:\t", dc->pc);
 
diff --git a/target-mips/translate.c b/target-mips/translate.c
index 66147d8..57e826d 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -19544,7 +19544,6 @@ gen_intermediate_code_internal(MIPSCPU *cpu, TranslationBlock *tb,
     DisasContext ctx;
     target_ulong pc_start;
     target_ulong next_page_start;
-    CPUBreakpoint *bp;
     int j, lj = -1;
     int num_insns;
     int max_insns;
@@ -19591,20 +19590,6 @@ gen_intermediate_code_internal(MIPSCPU *cpu, TranslationBlock *tb,
     LOG_DISAS("\ntb %p idx %d hflags %04x\n", tb, ctx.mem_idx, ctx.hflags);
     gen_tb_start(tb);
     while (ctx.bstate == BS_NONE) {
-        if (unlikely(!QTAILQ_EMPTY(&cs->breakpoints))) {
-            QTAILQ_FOREACH(bp, &cs->breakpoints, entry) {
-                if (bp->pc == ctx.pc) {
-                    save_cpu_state(&ctx, 1);
-                    ctx.bstate = BS_BRANCH;
-                    gen_helper_raise_exception_debug(cpu_env);
-                    /* Include the breakpoint location or the tb won't
-                     * be flushed when it must be.  */
-                    ctx.pc += 4;
-                    goto done_generating;
-                }
-            }
-        }
-
         if (search_pc) {
             j = tcg_op_buf_count();
             if (lj < j) {
@@ -19621,6 +19606,16 @@ gen_intermediate_code_internal(MIPSCPU *cpu, TranslationBlock *tb,
         tcg_gen_insn_start(ctx.pc);
         num_insns++;
 
+        if (unlikely(cpu_breakpoint_test(cs, ctx.pc, BP_ANY))) {
+            save_cpu_state(&ctx, 1);
+            ctx.bstate = BS_BRANCH;
+            gen_helper_raise_exception_debug(cpu_env);
+            /* Include the breakpoint location or the tb won't
+             * be flushed when it must be.  */
+            ctx.pc += 4;
+            goto done_generating;
+        }
+
         if (num_insns == max_insns && (tb->cflags & CF_LAST_IO)) {
             gen_io_start();
         }
diff --git a/target-moxie/translate.c b/target-moxie/translate.c
index f71ed24..d71f55b 100644
--- a/target-moxie/translate.c
+++ b/target-moxie/translate.c
@@ -822,7 +822,6 @@ gen_intermediate_code_internal(MoxieCPU *cpu, TranslationBlock *tb,
     CPUState *cs = CPU(cpu);
     DisasContext ctx;
     target_ulong pc_start;
-    CPUBreakpoint *bp;
     int j, lj = -1;
     CPUMoxieState *env = &cpu->env;
     int num_insns;
@@ -838,17 +837,6 @@ gen_intermediate_code_internal(MoxieCPU *cpu, TranslationBlock *tb,
 
     gen_tb_start(tb);
     do {
-        if (unlikely(!QTAILQ_EMPTY(&cs->breakpoints))) {
-            QTAILQ_FOREACH(bp, &cs->breakpoints, entry) {
-                if (ctx.pc == bp->pc) {
-                    tcg_gen_movi_i32(cpu_pc, ctx.pc);
-                    gen_helper_debug(cpu_env);
-                    ctx.bstate = BS_EXCP;
-                    goto done_generating;
-                }
-            }
-        }
-
         if (search_pc) {
             j = tcg_op_buf_count();
             if (lj < j) {
@@ -864,6 +852,13 @@ gen_intermediate_code_internal(MoxieCPU *cpu, TranslationBlock *tb,
         tcg_gen_insn_start(ctx.pc);
         num_insns++;
 
+        if (unlikely(cpu_breakpoint_test(cs, ctx.pc, BP_ANY))) {
+            tcg_gen_movi_i32(cpu_pc, ctx.pc);
+            gen_helper_debug(cpu_env);
+            ctx.bstate = BS_EXCP;
+            goto done_generating;
+        }
+
         ctx.opcode = cpu_lduw_code(env, ctx.pc);
         ctx.pc += decode_opc(cpu, &ctx);
 
diff --git a/target-openrisc/translate.c b/target-openrisc/translate.c
index f9b4ed5..9755850 100644
--- a/target-openrisc/translate.c
+++ b/target-openrisc/translate.c
@@ -1618,22 +1618,6 @@ static void disas_openrisc_insn(DisasContext *dc, OpenRISCCPU *cpu)
     }
 }
 
-static void check_breakpoint(OpenRISCCPU *cpu, DisasContext *dc)
-{
-    CPUState *cs = CPU(cpu);
-    CPUBreakpoint *bp;
-
-    if (unlikely(!QTAILQ_EMPTY(&cs->breakpoints))) {
-        QTAILQ_FOREACH(bp, &cs->breakpoints, entry) {
-            if (bp->pc == dc->pc) {
-                tcg_gen_movi_tl(cpu_pc, dc->pc);
-                gen_exception(dc, EXCP_DEBUG);
-                dc->is_jmp = DISAS_UPDATE;
-            }
-        }
-    }
-}
-
 static inline void gen_intermediate_code_internal(OpenRISCCPU *cpu,
                                                   TranslationBlock *tb,
                                                   int search_pc)
@@ -1674,7 +1658,6 @@ static inline void gen_intermediate_code_internal(OpenRISCCPU *cpu,
     gen_tb_start(tb);
 
     do {
-        check_breakpoint(cpu, dc);
         if (search_pc) {
             j = tcg_op_buf_count();
             if (k < j) {
@@ -1690,6 +1673,13 @@ static inline void gen_intermediate_code_internal(OpenRISCCPU *cpu,
         tcg_gen_insn_start(dc->pc);
         num_insns++;
 
+        if (unlikely(cpu_breakpoint_test(cs, dc->pc, BP_ANY))) {
+            tcg_gen_movi_tl(cpu_pc, dc->pc);
+            gen_exception(dc, EXCP_DEBUG);
+            dc->is_jmp = DISAS_UPDATE;
+            break;
+        }
+
         if (num_insns == max_insns && (tb->cflags & CF_LAST_IO)) {
             gen_io_start();
         }
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 7c288aa..fc234a3 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -11418,7 +11418,6 @@ static inline void gen_intermediate_code_internal(PowerPCCPU *cpu,
     DisasContext ctx, *ctxp = &ctx;
     opc_handler_t **table, *handler;
     target_ulong pc_start;
-    CPUBreakpoint *bp;
     int j, lj = -1;
     int num_insns;
     int max_insns;
@@ -11483,14 +11482,6 @@ static inline void gen_intermediate_code_internal(PowerPCCPU *cpu,
     tcg_clear_temp_count();
     /* Set env in case of segfault during code fetch */
     while (ctx.exception == POWERPC_EXCP_NONE && !tcg_op_buf_full()) {
-        if (unlikely(!QTAILQ_EMPTY(&cs->breakpoints))) {
-            QTAILQ_FOREACH(bp, &cs->breakpoints, entry) {
-                if (bp->pc == ctx.nip) {
-                    gen_debug_exception(ctxp);
-                    break;
-                }
-            }
-        }
         if (unlikely(search_pc)) {
             j = tcg_op_buf_count();
             if (lj < j) {
@@ -11505,6 +11496,11 @@ static inline void gen_intermediate_code_internal(PowerPCCPU *cpu,
         tcg_gen_insn_start(ctx.nip);
         num_insns++;
 
+        if (unlikely(cpu_breakpoint_test(cs, ctx.nip, BP_ANY))) {
+            gen_debug_exception(ctxp);
+            break;
+        }
+
         LOG_DISAS("----------------\n");
         LOG_DISAS("nip=" TARGET_FMT_lx " super=%d ir=%d\n",
                   ctx.nip, ctx.mem_idx, (int)msr_ir);
diff --git a/target-s390x/translate.c b/target-s390x/translate.c
index 58cf365..4959828 100644
--- a/target-s390x/translate.c
+++ b/target-s390x/translate.c
@@ -5330,7 +5330,6 @@ static inline void gen_intermediate_code_internal(S390CPU *cpu,
     uint64_t next_page_start;
     int j, lj = -1;
     int num_insns, max_insns;
-    CPUBreakpoint *bp;
     ExitStatus status;
     bool do_debug;
 
@@ -5373,20 +5372,17 @@ static inline void gen_intermediate_code_internal(S390CPU *cpu,
         tcg_gen_insn_start(dc.pc);
         num_insns++;
 
+        if (unlikely(cpu_breakpoint_test(cs, dc.pc, BP_ANY))) {
+            status = EXIT_PC_STALE;
+            do_debug = true;
+            break;
+        }
+
         if (num_insns == max_insns && (tb->cflags & CF_LAST_IO)) {
             gen_io_start();
         }
 
         status = NO_EXIT;
-        if (unlikely(!QTAILQ_EMPTY(&cs->breakpoints))) {
-            QTAILQ_FOREACH(bp, &cs->breakpoints, entry) {
-                if (bp->pc == dc.pc) {
-                    status = EXIT_PC_STALE;
-                    do_debug = true;
-                    break;
-                }
-            }
-        }
         if (status == NO_EXIT) {
             status = translate_one(env, &dc);
         }
diff --git a/target-sh4/translate.c b/target-sh4/translate.c
index e0294e7..53bf9e8 100644
--- a/target-sh4/translate.c
+++ b/target-sh4/translate.c
@@ -1824,7 +1824,6 @@ gen_intermediate_code_internal(SuperHCPU *cpu, TranslationBlock *tb,
     CPUSH4State *env = &cpu->env;
     DisasContext ctx;
     target_ulong pc_start;
-    CPUBreakpoint *bp;
     int i, ii;
     int num_insns;
     int max_insns;
@@ -1849,17 +1848,6 @@ gen_intermediate_code_internal(SuperHCPU *cpu, TranslationBlock *tb,
         max_insns = CF_COUNT_MASK;
     gen_tb_start(tb);
     while (ctx.bstate == BS_NONE && !tcg_op_buf_full()) {
-        if (unlikely(!QTAILQ_EMPTY(&cs->breakpoints))) {
-            QTAILQ_FOREACH(bp, &cs->breakpoints, entry) {
-                if (ctx.pc == bp->pc) {
-		    /* We have hit a breakpoint - make sure PC is up-to-date */
-		    tcg_gen_movi_i32(cpu_pc, ctx.pc);
-                    gen_helper_debug(cpu_env);
-                    ctx.bstate = BS_BRANCH;
-		    break;
-		}
-	    }
-	}
         if (search_pc) {
             i = tcg_op_buf_count();
             if (ii < i) {
@@ -1875,6 +1863,14 @@ gen_intermediate_code_internal(SuperHCPU *cpu, TranslationBlock *tb,
         tcg_gen_insn_start(ctx.pc);
         num_insns++;
 
+        if (unlikely(cpu_breakpoint_test(cs, ctx.pc, BP_ANY))) {
+            /* We have hit a breakpoint - make sure PC is up-to-date */
+            tcg_gen_movi_i32(cpu_pc, ctx.pc);
+            gen_helper_debug(cpu_env);
+            ctx.bstate = BS_BRANCH;
+            break;
+        }
+
         if (num_insns == max_insns && (tb->cflags & CF_LAST_IO)) {
             gen_io_start();
         }
diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index 762eb9b..f359ac9 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -5217,7 +5217,6 @@ static inline void gen_intermediate_code_internal(SPARCCPU *cpu,
     CPUSPARCState *env = &cpu->env;
     target_ulong pc_start, last_pc;
     DisasContext dc1, *dc = &dc1;
-    CPUBreakpoint *bp;
     int j, lj = -1;
     int num_insns;
     int max_insns;
@@ -5242,18 +5241,6 @@ static inline void gen_intermediate_code_internal(SPARCCPU *cpu,
         max_insns = CF_COUNT_MASK;
     gen_tb_start(tb);
     do {
-        if (unlikely(!QTAILQ_EMPTY(&cs->breakpoints))) {
-            QTAILQ_FOREACH(bp, &cs->breakpoints, entry) {
-                if (bp->pc == dc->pc) {
-                    if (dc->pc != pc_start)
-                        save_state(dc);
-                    gen_helper_debug(cpu_env);
-                    tcg_gen_exit_tb(0);
-                    dc->is_br = 1;
-                    goto exit_gen_loop;
-                }
-            }
-        }
         if (spc) {
             qemu_log("Search PC...\n");
             j = tcg_op_buf_count();
@@ -5270,6 +5257,16 @@ static inline void gen_intermediate_code_internal(SPARCCPU *cpu,
         tcg_gen_insn_start(dc->pc);
         num_insns++;
 
+        if (unlikely(cpu_breakpoint_test(cs, dc->pc, BP_ANY))) {
+            if (dc->pc != pc_start) {
+                save_state(dc);
+            }
+            gen_helper_debug(cpu_env);
+            tcg_gen_exit_tb(0);
+            dc->is_br = 1;
+            goto exit_gen_loop;
+        }
+
         if (num_insns == max_insns && (tb->cflags & CF_LAST_IO)) {
             gen_io_start();
         }
diff --git a/target-unicore32/translate.c b/target-unicore32/translate.c
index 7aad61f..cd23c4b 100644
--- a/target-unicore32/translate.c
+++ b/target-unicore32/translate.c
@@ -1872,7 +1872,6 @@ static inline void gen_intermediate_code_internal(UniCore32CPU *cpu,
     CPUState *cs = CPU(cpu);
     CPUUniCore32State *env = &cpu->env;
     DisasContext dc1, *dc = &dc1;
-    CPUBreakpoint *bp;
     int j, lj;
     target_ulong pc_start;
     uint32_t next_page_start;
@@ -1912,19 +1911,6 @@ static inline void gen_intermediate_code_internal(UniCore32CPU *cpu,
 
     gen_tb_start(tb);
     do {
-        if (unlikely(!QTAILQ_EMPTY(&cs->breakpoints))) {
-            QTAILQ_FOREACH(bp, &cs->breakpoints, entry) {
-                if (bp->pc == dc->pc) {
-                    gen_set_pc_im(dc->pc);
-                    gen_exception(EXCP_DEBUG);
-                    dc->is_jmp = DISAS_JUMP;
-                    /* Advance PC so that clearing the breakpoint will
-                       invalidate this TB.  */
-                    dc->pc += 2; /* FIXME */
-                    goto done_generating;
-                }
-            }
-        }
         if (search_pc) {
             j = tcg_op_buf_count();
             if (lj < j) {
@@ -1940,6 +1926,16 @@ static inline void gen_intermediate_code_internal(UniCore32CPU *cpu,
         tcg_gen_insn_start(dc->pc);
         num_insns++;
 
+        if (unlikely(cpu_breakpoint_test(cs, dc->pc, BP_ANY))) {
+            gen_set_pc_im(dc->pc);
+            gen_exception(EXCP_DEBUG);
+            dc->is_jmp = DISAS_JUMP;
+            /* Advance PC so that clearing the breakpoint will
+               invalidate this TB.  */
+            dc->pc += 2; /* FIXME */
+            goto done_generating;
+        }
+
         if (num_insns == max_insns && (tb->cflags & CF_LAST_IO)) {
             gen_io_start();
         }
diff --git a/target-xtensa/translate.c b/target-xtensa/translate.c
index 3607e41..ea87cb5 100644
--- a/target-xtensa/translate.c
+++ b/target-xtensa/translate.c
@@ -2984,22 +2984,6 @@ static inline unsigned xtensa_insn_len(CPUXtensaState *env, DisasContext *dc)
     return xtensa_op0_insn_len(OP0);
 }
 
-static void check_breakpoint(CPUXtensaState *env, DisasContext *dc)
-{
-    CPUState *cs = CPU(xtensa_env_get_cpu(env));
-    CPUBreakpoint *bp;
-
-    if (unlikely(!QTAILQ_EMPTY(&cs->breakpoints))) {
-        QTAILQ_FOREACH(bp, &cs->breakpoints, entry) {
-            if (bp->pc == dc->pc) {
-                tcg_gen_movi_i32(cpu_pc, dc->pc);
-                gen_exception(dc, EXCP_DEBUG);
-                dc->is_jmp = DISAS_UPDATE;
-             }
-        }
-    }
-}
-
 static void gen_ibreak_check(CPUXtensaState *env, DisasContext *dc)
 {
     unsigned i;
@@ -3062,8 +3046,6 @@ void gen_intermediate_code_internal(XtensaCPU *cpu,
     }
 
     do {
-        check_breakpoint(env, &dc);
-
         if (search_pc) {
             j = tcg_op_buf_count();
             if (lj < j) {
@@ -3081,6 +3063,13 @@ void gen_intermediate_code_internal(XtensaCPU *cpu,
 
         ++dc.ccount_delta;
 
+        if (unlikely(cpu_breakpoint_test(cs, dc.pc, BP_ANY))) {
+            tcg_gen_movi_i32(cpu_pc, dc.pc);
+            gen_exception(&dc, EXCP_DEBUG);
+            dc.is_jmp = DISAS_UPDATE;
+            break;
+        }
+
         if (insn_count == max_insns && (tb->cflags & CF_LAST_IO)) {
             gen_io_start();
         }
-- 
2.4.3
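
For readers following the conversions above, here is a minimal sketch of
the check that each open-coded QTAILQ_FOREACH loop collapses into. It is
not the helper from this patch: the list type, the function names and the
flag values are stand-ins invented for the example, and the real helper
takes a CPUState. The last function mirrors the target-i386 call site,
where a set RF narrows the mask to debugger breakpoints only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values are assumptions chosen for this sketch. */
enum {
    BP_GDB = 0x10,              /* breakpoint requested by the debugger */
    BP_CPU = 0x20,              /* breakpoint raised by guest debug state */
    BP_ANY = BP_GDB | BP_CPU,
};

/* Hypothetical stand-in for CPUBreakpoint: a singly linked list node. */
typedef struct Breakpoint {
    uint64_t pc;
    int flags;
    struct Breakpoint *next;
} Breakpoint;

/* Return true if a breakpoint at PC has at least one MASK flag set. */
static bool breakpoint_test(const Breakpoint *head, uint64_t pc, int mask)
{
    /* A zero mask could never match; treat it as a caller bug. */
    assert(mask != 0);

    for (const Breakpoint *bp = head; bp != NULL; bp = bp->next) {
        if (bp->pc == pc && (bp->flags & mask)) {
            return true;
        }
    }
    return false;
}

/* With RF set, only debugger breakpoints are honoured. */
static int breakpoint_mask_for_rf(bool rf_set)
{
    return rf_set ? BP_GDB : BP_ANY;
}
```

Passing the mask as a parameter is what lets the i386 special case stay a
one-line call instead of a second condition inside the loop.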


* [Qemu-devel] [PATCH v3 05/25] tcg: Allow extra data to be attached to insn_start
  2015-09-22 20:24 [Qemu-devel] [PATCH v3 00/25] Do away with TB retranslation Richard Henderson
                   ` (3 preceding siblings ...)
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 04/25] target-*: Introduce and use cpu_breakpoint_test Richard Henderson
@ 2015-09-22 20:24 ` Richard Henderson
  2015-09-23 14:55   ` Kevin O'Connor
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 06/25] target-arm: Add condexec state " Richard Henderson
                   ` (19 subsequent siblings)
  24 siblings, 1 reply; 53+ messages in thread
From: Richard Henderson @ 2015-09-22 20:24 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, alex.bennee, aurelien

Allow targets to attach extra words of data to insn_start, with an eye
toward having this data replace the gen_opc_* arrays that each target
collects in order to enable restore_state_from_tb.

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/tcg-op.h  | 52 ++++++++++++++++++++++++++++++++++++++++++++--------
 tcg/tcg-opc.h |  4 ++--
 tcg/tcg.c     | 13 +++++++------
 tcg/tcg.h     |  6 ++++++
 4 files changed, 59 insertions(+), 16 deletions(-)

diff --git a/tcg/tcg-op.h b/tcg/tcg-op.h
index 6409db8..4e20dc1 100644
--- a/tcg/tcg-op.h
+++ b/tcg/tcg-op.h
@@ -700,17 +700,53 @@ static inline void tcg_gen_concat32_i64(TCGv_i64 ret, TCGv_i64 lo, TCGv_i64 hi)
 #error must include QEMU headers
 #endif
 
-/* debug info: write the PC of the corresponding QEMU CPU instruction */
-static inline void tcg_gen_insn_start(uint64_t pc)
+#if TARGET_INSN_START_WORDS == 1
+# if TARGET_LONG_BITS <= TCG_TARGET_REG_BITS
+static inline void tcg_gen_insn_start(target_ulong pc)
 {
-    /* XXX: must really use a 32 bit size for TCGArg in all cases */
-#if TARGET_LONG_BITS > TCG_TARGET_REG_BITS
-    tcg_gen_op2ii(INDEX_op_insn_start,
-                  (uint32_t)(pc), (uint32_t)(pc >> 32));
+    tcg_gen_op1(&tcg_ctx, INDEX_op_insn_start, pc);
+}
+# else
+static inline void tcg_gen_insn_start(target_ulong pc)
+{
+    tcg_gen_op2(&tcg_ctx, INDEX_op_insn_start,
+                (uint32_t)pc, (uint32_t)(pc >> 32));
+}
+# endif
+#elif TARGET_INSN_START_WORDS == 2
+# if TARGET_LONG_BITS <= TCG_TARGET_REG_BITS
+static inline void tcg_gen_insn_start(target_ulong pc, target_ulong a1)
+{
+    tcg_gen_op2(&tcg_ctx, INDEX_op_insn_start, pc, a1);
+}
+# else
+static inline void tcg_gen_insn_start(target_ulong pc, target_ulong a1)
+{
+    tcg_gen_op4(&tcg_ctx, INDEX_op_insn_start,
+                (uint32_t)pc, (uint32_t)(pc >> 32),
+                (uint32_t)a1, (uint32_t)(a1 >> 32));
+}
+# endif
+#elif TARGET_INSN_START_WORDS == 3
+# if TARGET_LONG_BITS <= TCG_TARGET_REG_BITS
+static inline void tcg_gen_insn_start(target_ulong pc, target_ulong a1,
+                                      target_ulong a2)
+{
+    tcg_gen_op3(&tcg_ctx, INDEX_op_insn_start, pc, a1, a2);
+}
+# else
+static inline void tcg_gen_insn_start(target_ulong pc, target_ulong a1,
+                                      target_ulong a2)
+{
+    tcg_gen_op6(&tcg_ctx, INDEX_op_insn_start,
+                (uint32_t)pc, (uint32_t)(pc >> 32),
+                (uint32_t)a1, (uint32_t)(a1 >> 32),
+                (uint32_t)a2, (uint32_t)(a2 >> 32));
+}
+# endif
 #else
-    tcg_gen_op1i(INDEX_op_insn_start, pc);
+# error "Unhandled number of operands to insn_start"
 #endif
-}
 
 static inline void tcg_gen_exit_tb(uintptr_t val)
 {
diff --git a/tcg/tcg-opc.h b/tcg/tcg-opc.h
index f60d3c2..c6f9570 100644
--- a/tcg/tcg-opc.h
+++ b/tcg/tcg-opc.h
@@ -175,9 +175,9 @@ DEF(mulsh_i64, 1, 2, 0, IMPL(TCG_TARGET_HAS_mulsh_i64))
 
 /* QEMU specific */
 #if TARGET_LONG_BITS > TCG_TARGET_REG_BITS
-DEF(insn_start, 0, 0, 2, TCG_OPF_NOT_PRESENT)
+DEF(insn_start, 0, 0, 2 * TARGET_INSN_START_WORDS, TCG_OPF_NOT_PRESENT)
 #else
-DEF(insn_start, 0, 0, 1, TCG_OPF_NOT_PRESENT)
+DEF(insn_start, 0, 0, TARGET_INSN_START_WORDS, TCG_OPF_NOT_PRESENT)
 #endif
 DEF(exit_tb, 0, 0, 1, TCG_OPF_BB_END)
 DEF(goto_tb, 0, 0, 1, TCG_OPF_BB_END)
diff --git a/tcg/tcg.c b/tcg/tcg.c
index df8788b..3308d68 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -991,16 +991,17 @@ void tcg_dump_ops(TCGContext *s)
         args = &s->gen_opparam_buf[op->args];
 
         if (c == INDEX_op_insn_start) {
-            uint64_t pc;
+            qemu_log("%s ----", oi != s->gen_first_op_idx ? "\n" : "");
+
+            for (i = 0; i < TARGET_INSN_START_WORDS; ++i) {
+                target_ulong a;
 #if TARGET_LONG_BITS > TCG_TARGET_REG_BITS
-            pc = ((uint64_t)args[1] << 32) | args[0];
+                a = ((target_ulong)args[i * 2 + 1] << 32) | args[i * 2];
 #else
-            pc = args[0];
+                a = args[i];
 #endif
-            if (oi != s->gen_first_op_idx) {
-                qemu_log("\n");
+                qemu_log(" " TARGET_FMT_lx, a);
             }
-            qemu_log(" ---- 0x%" PRIx64, pc);
         } else if (c == INDEX_op_call) {
             /* variable number of arguments */
             nb_oargs = op->callo;
diff --git a/tcg/tcg.h b/tcg/tcg.h
index 879a665..c975076 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -129,6 +129,12 @@ typedef uint64_t TCGRegSet;
 # error "Missing unsigned widening multiply"
 #endif
 
+#ifndef TARGET_INSN_START_EXTRA_WORDS
+# define TARGET_INSN_START_WORDS 1
+#else
+# define TARGET_INSN_START_WORDS (1 + TARGET_INSN_START_EXTRA_WORDS)
+#endif
+
 typedef enum TCGOpcode {
 #define DEF(name, oargs, iargs, cargs, flags) INDEX_op_ ## name,
 #include "tcg-opc.h"
-- 
2.4.3


* [Qemu-devel] [PATCH v3 06/25] target-arm: Add condexec state to insn_start
  2015-09-22 20:24 [Qemu-devel] [PATCH v3 00/25] Do away with TB retranslation Richard Henderson
                   ` (4 preceding siblings ...)
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 05/25] tcg: Allow extra data to be attached to insn_start Richard Henderson
@ 2015-09-22 20:24 ` Richard Henderson
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 07/25] target-i386: Add cc_op " Richard Henderson
                   ` (18 subsequent siblings)
  24 siblings, 0 replies; 53+ messages in thread
From: Richard Henderson @ 2015-09-22 20:24 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, alex.bennee, aurelien

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target-arm/cpu.h           | 1 +
 target-arm/translate-a64.c | 2 +-
 target-arm/translate.c     | 3 ++-
 3 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index 1b80516..c4a7400 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -97,6 +97,7 @@
 struct arm_boot_info;
 
 #define NB_MMU_MODES 7
+#define TARGET_INSN_START_EXTRA_WORDS 1
 
 /* We currently assume float and double are IEEE single and double
    precision respectively.
diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index bc2040e..654a586 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -11090,7 +11090,7 @@ void gen_intermediate_code_internal_a64(ARMCPU *cpu,
             tcg_ctx.gen_opc_instr_start[lj] = 1;
             tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
-        tcg_gen_insn_start(dc->pc);
+        tcg_gen_insn_start(dc->pc, 0);
         num_insns++;
 
         if (unlikely(!QTAILQ_EMPTY(&cs->breakpoints))) {
diff --git a/target-arm/translate.c b/target-arm/translate.c
index 44468dc..fb69ecb 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -11317,7 +11317,8 @@ static inline void gen_intermediate_code_internal(ARMCPU *cpu,
             tcg_ctx.gen_opc_instr_start[lj] = 1;
             tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
-        tcg_gen_insn_start(dc->pc);
+        tcg_gen_insn_start(dc->pc,
+                           (dc->condexec_cond << 4) | (dc->condexec_mask >> 1));
         num_insns++;
 
 #ifdef CONFIG_USER_ONLY
-- 
2.4.3
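
The extra word passed above is the same expression that was stored in
gen_opc_condexec_bits. As a rough sketch of what that word holds, the
helpers below pack and unpack it. The helper names are hypothetical, and
the unpacking assumes that the translator keeps condexec_mask shifted left
by one with its low bit clear when insn_start is emitted; that is an
assumption about DisasContext, not something this patch shows.

```c
#include <assert.h>
#include <stdint.h>

/* Pack the split IT-block state into one word: the 4-bit condition in
   bits [7:4], the mask (stored pre-shifted by one) in bits [3:0]. */
static uint32_t condexec_pack(uint32_t cond, uint32_t mask)
{
    assert(cond <= 0xf && mask <= 0x1f);
    return (cond << 4) | (mask >> 1);
}

/* Recover the condition field from the packed word. */
static uint32_t condexec_unpack_cond(uint32_t bits)
{
    return bits >> 4;
}

/* Recover the mask in its pre-shifted form.  The low bit dropped by
   condexec_pack comes back as zero. */
static uint32_t condexec_unpack_mask(uint32_t bits)
{
    return (bits & 0xf) << 1;
}
```

The round trip is exact only when the low bit of the mask is clear, which
is the assumption stated above.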


* [Qemu-devel] [PATCH v3 07/25] target-i386: Add cc_op state to insn_start
  2015-09-22 20:24 [Qemu-devel] [PATCH v3 00/25] Do away with TB retranslation Richard Henderson
                   ` (5 preceding siblings ...)
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 06/25] target-arm: Add condexec state " Richard Henderson
@ 2015-09-22 20:24 ` Richard Henderson
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 08/25] target-mips: Add delayed branch " Richard Henderson
                   ` (17 subsequent siblings)
  24 siblings, 0 replies; 53+ messages in thread
From: Richard Henderson @ 2015-09-22 20:24 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, alex.bennee, aurelien

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target-i386/cpu.h       | 1 +
 target-i386/translate.c | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index 5231e8c..717d558 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -794,6 +794,7 @@ typedef struct {
 #define MAX_GP_COUNTERS    (MSR_IA32_PERF_STATUS - MSR_P6_EVNTSEL0)
 
 #define NB_MMU_MODES 3
+#define TARGET_INSN_START_EXTRA_WORDS 1
 
 #define NB_OPMASK_REGS 8
 
diff --git a/target-i386/translate.c b/target-i386/translate.c
index 9ec9c4c..7501b91 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -7949,7 +7949,7 @@ static inline void gen_intermediate_code_internal(X86CPU *cpu,
             tcg_ctx.gen_opc_instr_start[lj] = 1;
             tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
-        tcg_gen_insn_start(pc_ptr);
+        tcg_gen_insn_start(pc_ptr, dc->cc_op);
         num_insns++;
 
         /* If RF is set, suppress an internally generated breakpoint.  */
-- 
2.4.3


* [Qemu-devel] [PATCH v3 08/25] target-mips: Add delayed branch state to insn_start
  2015-09-22 20:24 [Qemu-devel] [PATCH v3 00/25] Do away with TB retranslation Richard Henderson
                   ` (6 preceding siblings ...)
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 07/25] target-i386: Add cc_op " Richard Henderson
@ 2015-09-22 20:24 ` Richard Henderson
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 09/25] target-s390x: Add cc_op " Richard Henderson
                   ` (16 subsequent siblings)
  24 siblings, 0 replies; 53+ messages in thread
From: Richard Henderson @ 2015-09-22 20:24 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, alex.bennee, aurelien

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target-mips/cpu.h       | 1 +
 target-mips/translate.c | 3 ++-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/target-mips/cpu.h b/target-mips/cpu.h
index ed7d86d..fd23832 100644
--- a/target-mips/cpu.h
+++ b/target-mips/cpu.h
@@ -132,6 +132,7 @@ struct CPUMIPSFPUContext {
 };
 
 #define NB_MMU_MODES 3
+#define TARGET_INSN_START_EXTRA_WORDS 2
 
 typedef struct CPUMIPSMVPContext CPUMIPSMVPContext;
 struct CPUMIPSMVPContext {
diff --git a/target-mips/translate.c b/target-mips/translate.c
index 57e826d..30d7d46 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -19562,6 +19562,7 @@ gen_intermediate_code_internal(MIPSCPU *cpu, TranslationBlock *tb,
     ctx.CP0_Config1 = env->CP0_Config1;
     ctx.tb = tb;
     ctx.bstate = BS_NONE;
+    ctx.btarget = 0;
     ctx.kscrexist = (env->CP0_Config4 >> CP0C4_KScrExist) & 0xff;
     ctx.rxi = (env->CP0_Config3 >> CP0C3_RXI) & 1;
     ctx.ie = (env->CP0_Config4 >> CP0C4_IE) & 3;
@@ -19603,7 +19604,7 @@ gen_intermediate_code_internal(MIPSCPU *cpu, TranslationBlock *tb,
             tcg_ctx.gen_opc_instr_start[lj] = 1;
             tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
-        tcg_gen_insn_start(ctx.pc);
+        tcg_gen_insn_start(ctx.pc, ctx.hflags & MIPS_HFLAG_BMASK, ctx.btarget);
         num_insns++;
 
         if (unlikely(cpu_breakpoint_test(cs, ctx.pc, BP_ANY))) {
-- 
2.4.3


* [Qemu-devel] [PATCH v3 09/25] target-s390x: Add cc_op state to insn_start
  2015-09-22 20:24 [Qemu-devel] [PATCH v3 00/25] Do away with TB retranslation Richard Henderson
                   ` (7 preceding siblings ...)
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 08/25] target-mips: Add delayed branch " Richard Henderson
@ 2015-09-22 20:24 ` Richard Henderson
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 10/25] target-sh4: Add flags " Richard Henderson
                   ` (15 subsequent siblings)
  24 siblings, 0 replies; 53+ messages in thread
From: Richard Henderson @ 2015-09-22 20:24 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, alex.bennee, aurelien

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target-s390x/cpu.h       | 1 +
 target-s390x/translate.c | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/target-s390x/cpu.h b/target-s390x/cpu.h
index 9aeb024..68d6528 100644
--- a/target-s390x/cpu.h
+++ b/target-s390x/cpu.h
@@ -43,6 +43,7 @@
 #include "fpu/softfloat.h"
 
 #define NB_MMU_MODES 3
+#define TARGET_INSN_START_EXTRA_WORDS 1
 
 #define MMU_MODE0_SUFFIX _primary
 #define MMU_MODE1_SUFFIX _secondary
diff --git a/target-s390x/translate.c b/target-s390x/translate.c
index 4959828..6bbc760 100644
--- a/target-s390x/translate.c
+++ b/target-s390x/translate.c
@@ -5369,7 +5369,7 @@ static inline void gen_intermediate_code_internal(S390CPU *cpu,
             tcg_ctx.gen_opc_instr_start[lj] = 1;
             tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
-        tcg_gen_insn_start(dc.pc);
+        tcg_gen_insn_start(dc.pc, dc.cc_op);
         num_insns++;
 
         if (unlikely(cpu_breakpoint_test(cs, dc.pc, BP_ANY))) {
-- 
2.4.3


* [Qemu-devel] [PATCH v3 10/25] target-sh4: Add flags state to insn_start
  2015-09-22 20:24 [Qemu-devel] [PATCH v3 00/25] Do away with TB retranslation Richard Henderson
                   ` (8 preceding siblings ...)
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 09/25] target-s390x: Add cc_op " Richard Henderson
@ 2015-09-22 20:24 ` Richard Henderson
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 11/25] target-cris: Mirror gen_opc_pc into insn_start Richard Henderson
                   ` (14 subsequent siblings)
  24 siblings, 0 replies; 53+ messages in thread
From: Richard Henderson @ 2015-09-22 20:24 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, alex.bennee, aurelien

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target-sh4/cpu.h       | 1 +
 target-sh4/translate.c | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/target-sh4/cpu.h b/target-sh4/cpu.h
index 1f68b27..ea854cb 100644
--- a/target-sh4/cpu.h
+++ b/target-sh4/cpu.h
@@ -122,6 +122,7 @@ typedef struct tlb_t {
 #define ITLB_SIZE 4
 
 #define NB_MMU_MODES 2
+#define TARGET_INSN_START_EXTRA_WORDS 1
 
 enum sh_features {
     SH_FEATURE_SH4A = 1,
diff --git a/target-sh4/translate.c b/target-sh4/translate.c
index 53bf9e8..efaa6f6 100644
--- a/target-sh4/translate.c
+++ b/target-sh4/translate.c
@@ -1860,7 +1860,7 @@ gen_intermediate_code_internal(SuperHCPU *cpu, TranslationBlock *tb,
             tcg_ctx.gen_opc_instr_start[ii] = 1;
             tcg_ctx.gen_opc_icount[ii] = num_insns;
         }
-        tcg_gen_insn_start(ctx.pc);
+        tcg_gen_insn_start(ctx.pc, ctx.flags);
         num_insns++;
 
         if (unlikely(cpu_breakpoint_test(cs, ctx.pc, BP_ANY))) {
-- 
2.4.3


* [Qemu-devel] [PATCH v3 11/25] target-cris: Mirror gen_opc_pc into insn_start
  2015-09-22 20:24 [Qemu-devel] [PATCH v3 00/25] Do away with TB retranslation Richard Henderson
                   ` (9 preceding siblings ...)
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 10/25] target-sh4: Add flags " Richard Henderson
@ 2015-09-22 20:24 ` Richard Henderson
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 12/25] target-sparc: Tidy gen_branch_a interface Richard Henderson
                   ` (13 subsequent siblings)
  24 siblings, 0 replies; 53+ messages in thread
From: Richard Henderson @ 2015-09-22 20:24 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, alex.bennee, aurelien

This perhaps isn't ideal in terms of (ab)using the "pc" field
to encode both pc and ppc + delay branch state, as one has to
be aware of this when examining opcode dumps.

But it preserves existing logic, which will be good for bisection,
and it certainly does save storage space.
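
As a standalone illustration of the encoding (not code from this
patch; the helper names are hypothetical, and the sketch assumes that
bit 0 of every valid CRIS instruction address is clear so that it is
free to act as a tag), the packed word can be built and taken apart
like this:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper mirroring the expression passed to
 * tcg_gen_insn_start: in a delay slot, record the previous pc (ppc)
 * with bit 0 set as a tag; otherwise record pc unchanged.  */
static uint32_t cris_pack_insn_start(uint32_t pc, uint32_t ppc,
                                     bool in_delay_slot)
{
    /* Assumption: instruction addresses are at least 2-byte aligned.  */
    assert((pc & 1) == 0 && (ppc & 1) == 0);
    return in_delay_slot ? (ppc | 1) : pc;
}

/* True if the packed word describes an insn in a delay slot.  */
static bool cris_word_is_delay_slot(uint32_t word)
{
    return (word & 1) != 0;
}

/* Strip the tag bit to recover the address that was recorded.  */
static uint32_t cris_word_addr(uint32_t word)
{
    return word & ~(uint32_t)1;
}
```

Anyone reading an opcode dump therefore has to mask off bit 0 before
treating the value as an address, which is the caveat noted above.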

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target-cris/translate.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/target-cris/translate.c b/target-cris/translate.c
index 477bddc..3d55a6a 100644
--- a/target-cris/translate.c
+++ b/target-cris/translate.c
@@ -3174,7 +3174,8 @@ gen_intermediate_code_internal(CRISCPU *cpu, TranslationBlock *tb,
             tcg_ctx.gen_opc_instr_start[lj] = 1;
             tcg_ctx.gen_opc_icount[lj] = num_insns;
         }
-        tcg_gen_insn_start(dc->pc);
+        tcg_gen_insn_start(dc->delayed_branch == 1
+                           ? dc->ppc | 1 : dc->pc);
         num_insns++;
 
         if (unlikely(cpu_breakpoint_test(cs, dc->pc, BP_ANY))) {
-- 
2.4.3


* [Qemu-devel] [PATCH v3 12/25] target-sparc: Tidy gen_branch_a interface
  2015-09-22 20:24 [Qemu-devel] [PATCH v3 00/25] Do away with TB retranslation Richard Henderson
                   ` (10 preceding siblings ...)
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 11/25] target-cris: Mirror gen_opc_pc into insn_start Richard Henderson
@ 2015-09-22 20:24 ` Richard Henderson
  2015-09-22 21:23   ` Aurelien Jarno
  2015-09-24 19:42   ` Aurelien Jarno
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 13/25] target-sparc: Split out gen_branch_n Richard Henderson
                   ` (12 subsequent siblings)
  24 siblings, 2 replies; 53+ messages in thread
From: Richard Henderson @ 2015-09-22 20:24 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, alex.bennee, aurelien

We always pass pc2 == dc->npc and r_cond == cpu_cond,
and always set is_br afterward.  Infer all of that.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target-sparc/translate.c | 21 ++++++++++-----------
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index f359ac9..cbc90d8 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -955,17 +955,19 @@ static inline void gen_branch2(DisasContext *dc, target_ulong pc1,
     gen_goto_tb(dc, 1, pc2, pc2 + 4);
 }
 
-static inline void gen_branch_a(DisasContext *dc, target_ulong pc1,
-                                target_ulong pc2, TCGv r_cond)
+static void gen_branch_a(DisasContext *dc, target_ulong pc1)
 {
     TCGLabel *l1 = gen_new_label();
+    target_ulong npc = dc->npc;
 
-    tcg_gen_brcondi_tl(TCG_COND_EQ, r_cond, 0, l1);
+    tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_cond, 0, l1);
 
-    gen_goto_tb(dc, 0, pc2, pc1);
+    gen_goto_tb(dc, 0, npc, pc1);
 
     gen_set_label(l1);
-    gen_goto_tb(dc, 1, pc2 + 4, pc2 + 8);
+    gen_goto_tb(dc, 1, npc + 4, npc + 8);
+
+    dc->is_br = 1;
 }
 
 static inline void gen_generic_branch(DisasContext *dc)
@@ -1398,8 +1400,7 @@ static void do_branch(DisasContext *dc, int32_t offset, uint32_t insn, int cc)
         flush_cond(dc);
         gen_cond(cpu_cond, cc, cond, dc);
         if (a) {
-            gen_branch_a(dc, target, dc->npc, cpu_cond);
-            dc->is_br = 1;
+            gen_branch_a(dc, target);
         } else {
             dc->pc = dc->npc;
             dc->jump_pc[0] = target;
@@ -1447,8 +1448,7 @@ static void do_fbranch(DisasContext *dc, int32_t offset, uint32_t insn, int cc)
         flush_cond(dc);
         gen_fcond(cpu_cond, cc, cond);
         if (a) {
-            gen_branch_a(dc, target, dc->npc, cpu_cond);
-            dc->is_br = 1;
+            gen_branch_a(dc, target);
         } else {
             dc->pc = dc->npc;
             dc->jump_pc[0] = target;
@@ -1476,8 +1476,7 @@ static void do_branch_reg(DisasContext *dc, int32_t offset, uint32_t insn,
     flush_cond(dc);
     gen_cond_reg(cpu_cond, cond, r_reg);
     if (a) {
-        gen_branch_a(dc, target, dc->npc, cpu_cond);
-        dc->is_br = 1;
+        gen_branch_a(dc, target);
     } else {
         dc->pc = dc->npc;
         dc->jump_pc[0] = target;
-- 
2.4.3


* [Qemu-devel] [PATCH v3 13/25] target-sparc: Split out gen_branch_n
  2015-09-22 20:24 [Qemu-devel] [PATCH v3 00/25] Do away with TB retranslation Richard Henderson
                   ` (11 preceding siblings ...)
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 12/25] target-sparc: Tidy gen_branch_a interface Richard Henderson
@ 2015-09-22 20:24 ` Richard Henderson
  2015-09-24 19:42   ` Aurelien Jarno
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 14/25] target-sparc: Remove gen_opc_jump_pc Richard Henderson
                   ` (11 subsequent siblings)
  24 siblings, 1 reply; 53+ messages in thread
From: Richard Henderson @ 2015-09-22 20:24 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, alex.bennee, aurelien

Unify three copies of this code from different
branch types.  Fix the case when npc == DYNAMIC_PC,
i.e. a branch within a delay slot.
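
For the DYNAMIC_PC path, the run-time effect of the ops emitted by
gen_branch_n can be modelled in plain C roughly as follows.  This is
a sketch only, not code from this patch: the SparcRegs type and the
branch_n_dynamic name are made up for the illustration, and 32-bit
registers are assumed.

```c
#include <stdint.h>

/* Hypothetical model of the three CPU fields the emitted ops touch.  */
typedef struct {
    uint32_t pc;
    uint32_t npc;
    uint32_t cond;   /* non-zero means the branch is taken */
} SparcRegs;

/* Model of: mov pc, npc; addi npc, npc, 4;
 * movcond(NE) npc, cond, 0, target, npc.  */
static SparcRegs branch_n_dynamic(SparcRegs r, uint32_t target)
{
    r.pc = r.npc;            /* the delay slot insn executes next */
    r.npc += 4;              /* fall-through successor of the slot */
    if (r.cond != 0) {
        r.npc = target;      /* taken: continue at the branch target */
    }
    return r;
}
```

Both pc and npc stay dynamic here, which matches the dc->pc =
DYNAMIC_PC assignment at the end of that path.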

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target-sparc/translate.c | 55 ++++++++++++++++++++++++------------------------
 1 file changed, 28 insertions(+), 27 deletions(-)

diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index cbc90d8..c6a8d86 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -970,6 +970,31 @@ static void gen_branch_a(DisasContext *dc, target_ulong pc1)
     dc->is_br = 1;
 }
 
+static void gen_branch_n(DisasContext *dc, target_ulong pc1)
+{
+    target_ulong npc = dc->npc;
+
+    if (likely(npc != DYNAMIC_PC)) {
+        dc->pc = npc;
+        dc->jump_pc[0] = pc1;
+        dc->jump_pc[1] = npc + 4;
+        dc->npc = JUMP_PC;
+    } else {
+        TCGv t, z;
+
+        tcg_gen_mov_tl(cpu_pc, cpu_npc);
+
+        tcg_gen_addi_tl(cpu_npc, cpu_npc, 4);
+        t = tcg_const_tl(pc1);
+        z = tcg_const_tl(0);
+        tcg_gen_movcond_tl(TCG_COND_NE, cpu_npc, cpu_cond, z, t, cpu_npc);
+        tcg_temp_free(t);
+        tcg_temp_free(z);
+
+        dc->pc = DYNAMIC_PC;
+    }
+}
+
 static inline void gen_generic_branch(DisasContext *dc)
 {
     TCGv npc0 = tcg_const_tl(dc->jump_pc[0]);
@@ -1402,15 +1427,7 @@ static void do_branch(DisasContext *dc, int32_t offset, uint32_t insn, int cc)
         if (a) {
             gen_branch_a(dc, target);
         } else {
-            dc->pc = dc->npc;
-            dc->jump_pc[0] = target;
-            if (unlikely(dc->npc == DYNAMIC_PC)) {
-                dc->jump_pc[1] = DYNAMIC_PC;
-                tcg_gen_addi_tl(cpu_pc, cpu_npc, 4);
-            } else {
-                dc->jump_pc[1] = dc->npc + 4;
-                dc->npc = JUMP_PC;
-            }
+            gen_branch_n(dc, target);
         }
     }
 }
@@ -1450,15 +1467,7 @@ static void do_fbranch(DisasContext *dc, int32_t offset, uint32_t insn, int cc)
         if (a) {
             gen_branch_a(dc, target);
         } else {
-            dc->pc = dc->npc;
-            dc->jump_pc[0] = target;
-            if (unlikely(dc->npc == DYNAMIC_PC)) {
-                dc->jump_pc[1] = DYNAMIC_PC;
-                tcg_gen_addi_tl(cpu_pc, cpu_npc, 4);
-            } else {
-                dc->jump_pc[1] = dc->npc + 4;
-                dc->npc = JUMP_PC;
-            }
+            gen_branch_n(dc, target);
         }
     }
 }
@@ -1478,15 +1487,7 @@ static void do_branch_reg(DisasContext *dc, int32_t offset, uint32_t insn,
     if (a) {
         gen_branch_a(dc, target);
     } else {
-        dc->pc = dc->npc;
-        dc->jump_pc[0] = target;
-        if (unlikely(dc->npc == DYNAMIC_PC)) {
-            dc->jump_pc[1] = DYNAMIC_PC;
-            tcg_gen_addi_tl(cpu_pc, cpu_npc, 4);
-        } else {
-            dc->jump_pc[1] = dc->npc + 4;
-            dc->npc = JUMP_PC;
-        }
+        gen_branch_n(dc, target);
     }
 }
 
-- 
2.4.3


* [Qemu-devel] [PATCH v3 14/25] target-sparc: Remove gen_opc_jump_pc
  2015-09-22 20:24 [Qemu-devel] [PATCH v3 00/25] Do away with TB retranslation Richard Henderson
                   ` (12 preceding siblings ...)
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 13/25] target-sparc: Split out gen_branch_n Richard Henderson
@ 2015-09-22 20:24 ` Richard Henderson
  2015-09-24 19:42   ` Aurelien Jarno
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 15/25] target-sparc: Add npc state to insn_start Richard Henderson
                   ` (10 subsequent siblings)
  24 siblings, 1 reply; 53+ messages in thread
From: Richard Henderson @ 2015-09-22 20:24 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, alex.bennee, aurelien

Since jump_pc[1] is always npc + 4, we can infer after incrementing
that jump_pc[1] == pc + 4.  Because of that, we can encode the branch
destination into a single word, and store that in npc.
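
A standalone sketch of that encoding and of the matching decode in
restore_state_to_opc follows.  It is illustrative only: the function
names are hypothetical, 32-bit addresses are assumed, and the
DYNAMIC_PC/JUMP_PC values are taken from the literals 1 and 2 that
this patch replaces.

```c
#include <assert.h>
#include <stdint.h>

#define DYNAMIC_PC 1u   /* npc is only known at run time */
#define JUMP_PC    2u   /* npc is one of two static targets */

/* Pack the taken target into one word.  The not-taken target is not
 * stored because it is always pc + 4.  */
static uint32_t sparc_pack_npc(uint32_t pc, uint32_t jump_pc0,
                               uint32_t jump_pc1)
{
    assert(jump_pc1 == pc + 4);
    return jump_pc0 | JUMP_PC;
}

/* Recover npc from the stored word.  cur_npc stands in for the value
 * already held in env->npc, which is kept in the dynamic case.  */
static uint32_t sparc_unpack_npc(uint32_t pc, uint32_t word,
                                 int cond, uint32_t cur_npc)
{
    if (word == DYNAMIC_PC) {
        return cur_npc;                  /* already stored */
    } else if (word & JUMP_PC) {
        return cond ? (word & ~3u) : pc + 4;
    }
    return word;                         /* plain static npc */
}
```

The tag bits are free because instruction addresses are 4-byte
aligned, so the low two bits of a real target are always zero.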

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target-sparc/translate.c | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index c6a8d86..25b5bc0 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -65,7 +65,6 @@ static TCGv cpu_wim;
 static TCGv_i64 cpu_fpr[TARGET_DPREGS];
 
 static target_ulong gen_opc_npc[OPC_BUF_SIZE];
-static target_ulong gen_opc_jump_pc[2];
 
 #include "exec/gen-icount.h"
 
@@ -5250,6 +5249,10 @@ static inline void gen_intermediate_code_internal(SPARCCPU *cpu,
                     tcg_ctx.gen_opc_instr_start[lj++] = 0;
                 tcg_ctx.gen_opc_pc[lj] = dc->pc;
                 gen_opc_npc[lj] = dc->npc;
+                if (dc->npc & JUMP_PC) {
+                    assert(dc->jump_pc[1] == dc->pc + 4);
+                    gen_opc_npc[lj] = dc->jump_pc[0] | JUMP_PC;
+                }
                 tcg_ctx.gen_opc_instr_start[lj] = 1;
                 tcg_ctx.gen_opc_icount[lj] = num_insns;
             }
@@ -5321,8 +5324,6 @@ static inline void gen_intermediate_code_internal(SPARCCPU *cpu,
 #if 0
         log_page_dump();
 #endif
-        gen_opc_jump_pc[0] = dc->jump_pc[0];
-        gen_opc_jump_pc[1] = dc->jump_pc[1];
     } else {
         tb->size = last_pc + 4 - pc_start;
         tb->icount = num_insns;
@@ -5450,17 +5451,17 @@ void gen_intermediate_code_init(CPUSPARCState *env)
 
 void restore_state_to_opc(CPUSPARCState *env, TranslationBlock *tb, int pc_pos)
 {
-    target_ulong npc;
-    env->pc = tcg_ctx.gen_opc_pc[pc_pos];
+    target_ulong pc, npc;
+    env->pc = pc = tcg_ctx.gen_opc_pc[pc_pos];
     npc = gen_opc_npc[pc_pos];
-    if (npc == 1) {
+    if (npc == DYNAMIC_PC) {
         /* dynamic NPC: already stored */
-    } else if (npc == 2) {
+    } else if (npc & JUMP_PC) {
         /* jump PC: use 'cond' and the jump targets of the translation */
         if (env->cond) {
-            env->npc = gen_opc_jump_pc[0];
+            env->npc = npc & ~3;
         } else {
-            env->npc = gen_opc_jump_pc[1];
+            env->npc = pc + 4;
         }
     } else {
         env->npc = npc;
-- 
2.4.3


* [Qemu-devel] [PATCH v3 15/25] target-sparc: Add npc state to insn_start
  2015-09-22 20:24 [Qemu-devel] [PATCH v3 00/25] Do away with TB retranslation Richard Henderson
                   ` (13 preceding siblings ...)
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 14/25] target-sparc: Remove gen_opc_jump_pc Richard Henderson
@ 2015-09-22 20:24 ` Richard Henderson
  2015-09-24 19:42   ` Aurelien Jarno
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 16/25] tcg: Merge cpu_gen_code into tb_gen_code Richard Henderson
                   ` (9 subsequent siblings)
  24 siblings, 1 reply; 53+ messages in thread
From: Richard Henderson @ 2015-09-22 20:24 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, alex.bennee, aurelien

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target-sparc/cpu.h       | 1 +
 target-sparc/translate.c | 7 ++++++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/target-sparc/cpu.h b/target-sparc/cpu.h
index 72ea171..ac8f383 100644
--- a/target-sparc/cpu.h
+++ b/target-sparc/cpu.h
@@ -236,6 +236,7 @@ typedef struct trap_state {
     uint32_t tt;
 } trap_state;
 #endif
+#define TARGET_INSN_START_EXTRA_WORDS 1
 
 typedef struct sparc_def_t {
     const char *name;
diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index 25b5bc0..6e5b82d 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -5257,7 +5257,12 @@ static inline void gen_intermediate_code_internal(SPARCCPU *cpu,
                 tcg_ctx.gen_opc_icount[lj] = num_insns;
             }
         }
-        tcg_gen_insn_start(dc->pc);
+        if (dc->npc & JUMP_PC) {
+            assert(dc->jump_pc[1] == dc->pc + 4);
+            tcg_gen_insn_start(dc->pc, dc->jump_pc[0] | JUMP_PC);
+        } else {
+            tcg_gen_insn_start(dc->pc, dc->npc);
+        }
         num_insns++;
 
         if (unlikely(cpu_breakpoint_test(cs, dc->pc, BP_ANY))) {
-- 
2.4.3


* [Qemu-devel] [PATCH v3 16/25] tcg: Merge cpu_gen_code into tb_gen_code
  2015-09-22 20:24 [Qemu-devel] [PATCH v3 00/25] Do away with TB retranslation Richard Henderson
                   ` (14 preceding siblings ...)
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 15/25] target-sparc: Add npc state to insn_start Richard Henderson
@ 2015-09-22 20:24 ` Richard Henderson
  2015-09-24 19:48   ` Aurelien Jarno
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 17/25] target-*: Drop cpu_gen_code define Richard Henderson
                   ` (8 subsequent siblings)
  24 siblings, 1 reply; 53+ messages in thread
From: Richard Henderson @ 2015-09-22 20:24 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, alex.bennee, aurelien

tb_gen_code is its only caller, so this tidies things a bit.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 include/exec/exec-all.h |   2 -
 translate-all.c         | 131 ++++++++++++++++++++++--------------------------
 2 files changed, 59 insertions(+), 74 deletions(-)

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index a3719b7..5340745 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -78,8 +78,6 @@ void restore_state_to_opc(CPUArchState *env, struct TranslationBlock *tb,
                           int pc_pos);
 
 void cpu_gen_init(void);
-int cpu_gen_code(CPUArchState *env, struct TranslationBlock *tb,
-                 int *gen_code_size_ptr);
 bool cpu_restore_state(CPUState *cpu, uintptr_t searched_pc);
 void page_size_init(void);
 
diff --git a/translate-all.c b/translate-all.c
index 4a9ee33..19c2988 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -168,73 +168,7 @@ void cpu_gen_init(void)
     tcg_context_init(&tcg_ctx); 
 }
 
-/* return non zero if the very first instruction is invalid so that
- * the virtual CPU can trigger an exception.
- *
- * '*gen_code_size_ptr' contains the size of the generated code (host
- * code).
- *
- * Called with mmap_lock held for user-mode emulation.
- */
-int cpu_gen_code(CPUArchState *env, TranslationBlock *tb, int *gen_code_size_ptr)
-{
-    TCGContext *s = &tcg_ctx;
-    tcg_insn_unit *gen_code_buf;
-    int gen_code_size;
-#ifdef CONFIG_PROFILER
-    int64_t ti;
-#endif
-
-#ifdef CONFIG_PROFILER
-    s->tb_count1++; /* includes aborted translations because of
-                       exceptions */
-    ti = profile_getclock();
-#endif
-    tcg_func_start(s);
-
-    gen_intermediate_code(env, tb);
-
-    trace_translate_block(tb, tb->pc, tb->tc_ptr);
-
-    /* generate machine code */
-    gen_code_buf = tb->tc_ptr;
-    tb->tb_next_offset[0] = 0xffff;
-    tb->tb_next_offset[1] = 0xffff;
-    s->tb_next_offset = tb->tb_next_offset;
-#ifdef USE_DIRECT_JUMP
-    s->tb_jmp_offset = tb->tb_jmp_offset;
-    s->tb_next = NULL;
-#else
-    s->tb_jmp_offset = NULL;
-    s->tb_next = tb->tb_next;
-#endif
-
-#ifdef CONFIG_PROFILER
-    s->tb_count++;
-    s->interm_time += profile_getclock() - ti;
-    s->code_time -= profile_getclock();
-#endif
-    gen_code_size = tcg_gen_code(s, gen_code_buf);
-    *gen_code_size_ptr = gen_code_size;
-#ifdef CONFIG_PROFILER
-    s->code_time += profile_getclock();
-    s->code_in_len += tb->size;
-    s->code_out_len += gen_code_size;
-#endif
-
-#ifdef DEBUG_DISAS
-    if (qemu_loglevel_mask(CPU_LOG_TB_OUT_ASM)) {
-        qemu_log("OUT: [size=%d]\n", gen_code_size);
-        log_disas(tb->tc_ptr, gen_code_size);
-        qemu_log("\n");
-        qemu_log_flush();
-    }
-#endif
-    return 0;
-}
-
-/* The cpu state corresponding to 'searched_pc' is restored.
- */
+/* The cpu state corresponding to 'searched_pc' is restored.  */
 static int cpu_restore_state_from_tb(CPUState *cpu, TranslationBlock *tb,
                                      uintptr_t searched_pc)
 {
@@ -1034,7 +968,11 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
     TranslationBlock *tb;
     tb_page_addr_t phys_pc, phys_page2;
     target_ulong virt_page2;
-    int code_gen_size;
+    tcg_insn_unit *gen_code_buf;
+    int gen_code_size;
+#ifdef CONFIG_PROFILER
+    int64_t ti;
+#endif
 
     phys_pc = get_page_addr_code(env, pc);
     if (use_icount) {
@@ -1049,13 +987,62 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
         /* Don't forget to invalidate previous TB info.  */
         tcg_ctx.tb_ctx.tb_invalidated_flag = 1;
     }
-    tb->tc_ptr = tcg_ctx.code_gen_ptr;
+
+    gen_code_buf = tcg_ctx.code_gen_ptr;
+    tb->tc_ptr = gen_code_buf;
     tb->cs_base = cs_base;
     tb->flags = flags;
     tb->cflags = cflags;
-    cpu_gen_code(env, tb, &code_gen_size);
-    tcg_ctx.code_gen_ptr = (void *)(((uintptr_t)tcg_ctx.code_gen_ptr +
-            code_gen_size + CODE_GEN_ALIGN - 1) & ~(CODE_GEN_ALIGN - 1));
+
+#ifdef CONFIG_PROFILER
+    tcg_ctx.tb_count1++; /* includes aborted translations because of
+                       exceptions */
+    ti = profile_getclock();
+#endif
+
+    tcg_func_start(&tcg_ctx);
+
+    gen_intermediate_code(env, tb);
+
+    trace_translate_block(tb, tb->pc, tb->tc_ptr);
+
+    /* generate machine code */
+    tb->tb_next_offset[0] = 0xffff;
+    tb->tb_next_offset[1] = 0xffff;
+    tcg_ctx.tb_next_offset = tb->tb_next_offset;
+#ifdef USE_DIRECT_JUMP
+    tcg_ctx.tb_jmp_offset = tb->tb_jmp_offset;
+    tcg_ctx.tb_next = NULL;
+#else
+    tcg_ctx.tb_jmp_offset = NULL;
+    tcg_ctx.tb_next = tb->tb_next;
+#endif
+
+#ifdef CONFIG_PROFILER
+    tcg_ctx.tb_count++;
+    tcg_ctx.interm_time += profile_getclock() - ti;
+    tcg_ctx.code_time -= profile_getclock();
+#endif
+
+    gen_code_size = tcg_gen_code(&tcg_ctx, gen_code_buf);
+
+#ifdef CONFIG_PROFILER
+    tcg_ctx.code_time += profile_getclock();
+    tcg_ctx.code_in_len += tb->size;
+    tcg_ctx.code_out_len += gen_code_size;
+#endif
+
+#ifdef DEBUG_DISAS
+    if (qemu_loglevel_mask(CPU_LOG_TB_OUT_ASM)) {
+        qemu_log("OUT: [size=%d]\n", gen_code_size);
+        log_disas(tb->tc_ptr, gen_code_size);
+        qemu_log("\n");
+        qemu_log_flush();
+    }
+#endif
+
+    tcg_ctx.code_gen_ptr = (void *)(((uintptr_t)gen_code_buf +
+            gen_code_size + CODE_GEN_ALIGN - 1) & ~(CODE_GEN_ALIGN - 1));
 
     /* check next page if needed */
     virt_page2 = (pc + tb->size - 1) & TARGET_PAGE_MASK;
-- 
2.4.3


* [Qemu-devel] [PATCH v3 17/25] target-*: Drop cpu_gen_code define
  2015-09-22 20:24 [Qemu-devel] [PATCH v3 00/25] Do away with TB retranslation Richard Henderson
                   ` (15 preceding siblings ...)
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 16/25] tcg: Merge cpu_gen_code into tb_gen_code Richard Henderson
@ 2015-09-22 20:24 ` Richard Henderson
  2015-09-24 19:49   ` Aurelien Jarno
  2015-09-22 20:25 ` [Qemu-devel] [PATCH v3 18/25] tcg: Add TCG_MAX_INSNS Richard Henderson
                   ` (7 subsequent siblings)
  24 siblings, 1 reply; 53+ messages in thread
From: Richard Henderson @ 2015-09-22 20:24 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, alex.bennee, aurelien

This symbol no longer exists.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target-alpha/cpu.h      | 1 -
 target-arm/cpu.h        | 1 -
 target-cris/cpu.h       | 1 -
 target-i386/cpu.h       | 1 -
 target-lm32/cpu.h       | 1 -
 target-m68k/cpu.h       | 1 -
 target-microblaze/cpu.h | 1 -
 target-mips/cpu.h       | 1 -
 target-moxie/cpu.h      | 1 -
 target-openrisc/cpu.h   | 1 -
 target-ppc/cpu.h        | 1 -
 target-s390x/cpu.h      | 1 -
 target-sh4/cpu.h        | 1 -
 target-sparc/cpu.h      | 1 -
 target-tilegx/cpu.h     | 1 -
 target-xtensa/cpu.h     | 1 -
 16 files changed, 16 deletions(-)

diff --git a/target-alpha/cpu.h b/target-alpha/cpu.h
index ef88ffb..ce074fa 100644
--- a/target-alpha/cpu.h
+++ b/target-alpha/cpu.h
@@ -289,7 +289,6 @@ struct CPUAlphaState {
 
 #define cpu_list alpha_cpu_list
 #define cpu_exec cpu_alpha_exec
-#define cpu_gen_code cpu_alpha_gen_code
 #define cpu_signal_handler cpu_alpha_signal_handler
 
 #include "exec/cpu-all.h"
diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index c4a7400..21f90e4 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -1603,7 +1603,6 @@ static inline bool arm_excp_unmasked(CPUState *cs, unsigned int excp_idx,
 #define cpu_init(cpu_model) CPU(cpu_arm_init(cpu_model))
 
 #define cpu_exec cpu_arm_exec
-#define cpu_gen_code cpu_arm_gen_code
 #define cpu_signal_handler cpu_arm_signal_handler
 #define cpu_list arm_cpu_list
 
diff --git a/target-cris/cpu.h b/target-cris/cpu.h
index 8ae7708..99dd90e 100644
--- a/target-cris/cpu.h
+++ b/target-cris/cpu.h
@@ -225,7 +225,6 @@ enum {
 #define cpu_init(cpu_model) CPU(cpu_cris_init(cpu_model))
 
 #define cpu_exec cpu_cris_exec
-#define cpu_gen_code cpu_cris_gen_code
 #define cpu_signal_handler cpu_cris_signal_handler
 
 /* MMU modes definitions */
diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index 717d558..280d5f2 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -1190,7 +1190,6 @@ uint64_t cpu_get_tsc(CPUX86State *env);
 #define cpu_init(cpu_model) CPU(cpu_x86_init(cpu_model))
 
 #define cpu_exec cpu_x86_exec
-#define cpu_gen_code cpu_x86_gen_code
 #define cpu_signal_handler cpu_x86_signal_handler
 #define cpu_list x86_cpu_list
 #define cpudef_setup x86_cpudef_setup
diff --git a/target-lm32/cpu.h b/target-lm32/cpu.h
index cc77263..d40b9f7 100644
--- a/target-lm32/cpu.h
+++ b/target-lm32/cpu.h
@@ -221,7 +221,6 @@ bool lm32_cpu_do_semihosting(CPUState *cs);
 
 #define cpu_list lm32_cpu_list
 #define cpu_exec cpu_lm32_exec
-#define cpu_gen_code cpu_lm32_gen_code
 #define cpu_signal_handler cpu_lm32_signal_handler
 
 int lm32_cpu_handle_mmu_fault(CPUState *cpu, vaddr address, int rw,
diff --git a/target-m68k/cpu.h b/target-m68k/cpu.h
index 43a9a1c..705a3f9 100644
--- a/target-m68k/cpu.h
+++ b/target-m68k/cpu.h
@@ -215,7 +215,6 @@ void register_m68k_insns (CPUM68KState *env);
 #define cpu_init(cpu_model) CPU(cpu_m68k_init(cpu_model))
 
 #define cpu_exec cpu_m68k_exec
-#define cpu_gen_code cpu_m68k_gen_code
 #define cpu_signal_handler cpu_m68k_signal_handler
 #define cpu_list m68k_cpu_list
 
diff --git a/target-microblaze/cpu.h b/target-microblaze/cpu.h
index 402124a..e05e6ad 100644
--- a/target-microblaze/cpu.h
+++ b/target-microblaze/cpu.h
@@ -297,7 +297,6 @@ int cpu_mb_signal_handler(int host_signum, void *pinfo,
 #define cpu_init(cpu_model) CPU(cpu_mb_init(cpu_model))
 
 #define cpu_exec cpu_mb_exec
-#define cpu_gen_code cpu_mb_gen_code
 #define cpu_signal_handler cpu_mb_signal_handler
 
 /* MMU modes definitions */
diff --git a/target-mips/cpu.h b/target-mips/cpu.h
index fd23832..77ff614 100644
--- a/target-mips/cpu.h
+++ b/target-mips/cpu.h
@@ -622,7 +622,6 @@ void mips_cpu_unassigned_access(CPUState *cpu, hwaddr addr,
 void mips_cpu_list (FILE *f, fprintf_function cpu_fprintf);
 
 #define cpu_exec cpu_mips_exec
-#define cpu_gen_code cpu_mips_gen_code
 #define cpu_signal_handler cpu_mips_signal_handler
 #define cpu_list mips_cpu_list
 
diff --git a/target-moxie/cpu.h b/target-moxie/cpu.h
index 15ca15b..821bfee 100644
--- a/target-moxie/cpu.h
+++ b/target-moxie/cpu.h
@@ -124,7 +124,6 @@ int cpu_moxie_signal_handler(int host_signum, void *pinfo,
 #define cpu_init(cpu_model) CPU(cpu_moxie_init(cpu_model))
 
 #define cpu_exec cpu_moxie_exec
-#define cpu_gen_code cpu_moxie_gen_code
 #define cpu_signal_handler cpu_moxie_signal_handler
 
 static inline int cpu_mmu_index(CPUMoxieState *env, bool ifetch)
diff --git a/target-openrisc/cpu.h b/target-openrisc/cpu.h
index 560210d9..952ebba 100644
--- a/target-openrisc/cpu.h
+++ b/target-openrisc/cpu.h
@@ -361,7 +361,6 @@ int cpu_openrisc_signal_handler(int host_signum, void *pinfo, void *puc);
 
 #define cpu_list cpu_openrisc_list
 #define cpu_exec cpu_openrisc_exec
-#define cpu_gen_code cpu_openrisc_gen_code
 #define cpu_signal_handler cpu_openrisc_signal_handler
 
 #ifndef CONFIG_USER_ONLY
diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index 406d308..2d527e5 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -1241,7 +1241,6 @@ int ppc_dcr_write (ppc_dcr_t *dcr_env, int dcrn, uint32_t val);
 #define cpu_init(cpu_model) CPU(cpu_ppc_init(cpu_model))
 
 #define cpu_exec cpu_ppc_exec
-#define cpu_gen_code cpu_ppc_gen_code
 #define cpu_signal_handler cpu_ppc_signal_handler
 #define cpu_list ppc_cpu_list
 
diff --git a/target-s390x/cpu.h b/target-s390x/cpu.h
index 68d6528..95015d5 100644
--- a/target-s390x/cpu.h
+++ b/target-s390x/cpu.h
@@ -599,7 +599,6 @@ bool css_present(uint8_t cssid);
 
 #define cpu_init(model) CPU(cpu_s390x_init(model))
 #define cpu_exec cpu_s390x_exec
-#define cpu_gen_code cpu_s390x_gen_code
 #define cpu_signal_handler cpu_s390x_signal_handler
 
 void s390_cpu_list(FILE *f, fprintf_function cpu_fprintf);
diff --git a/target-sh4/cpu.h b/target-sh4/cpu.h
index ea854cb..8d44ce7 100644
--- a/target-sh4/cpu.h
+++ b/target-sh4/cpu.h
@@ -228,7 +228,6 @@ void cpu_load_tlb(CPUSH4State * env);
 #define cpu_init(cpu_model) CPU(cpu_sh4_init(cpu_model))
 
 #define cpu_exec cpu_sh4_exec
-#define cpu_gen_code cpu_sh4_gen_code
 #define cpu_signal_handler cpu_sh4_signal_handler
 #define cpu_list sh4_cpu_list
 
diff --git a/target-sparc/cpu.h b/target-sparc/cpu.h
index ac8f383..f24e9c4 100644
--- a/target-sparc/cpu.h
+++ b/target-sparc/cpu.h
@@ -599,7 +599,6 @@ int cpu_sparc_signal_handler(int host_signum, void *pinfo, void *puc);
 #endif
 
 #define cpu_exec cpu_sparc_exec
-#define cpu_gen_code cpu_sparc_gen_code
 #define cpu_signal_handler cpu_sparc_signal_handler
 #define cpu_list sparc_cpu_list
 
diff --git a/target-tilegx/cpu.h b/target-tilegx/cpu.h
index b9f5082..bcc1fdb 100644
--- a/target-tilegx/cpu.h
+++ b/target-tilegx/cpu.h
@@ -163,7 +163,6 @@ TileGXCPU *cpu_tilegx_init(const char *cpu_model);
 #define cpu_init(cpu_model) CPU(cpu_tilegx_init(cpu_model))
 
 #define cpu_exec cpu_tilegx_exec
-#define cpu_gen_code cpu_tilegx_gen_code
 #define cpu_signal_handler cpu_tilegx_signal_handler
 
 static inline void cpu_get_tb_cpu_state(CPUTLGState *env, target_ulong *pc,
diff --git a/target-xtensa/cpu.h b/target-xtensa/cpu.h
index dbd2c9c..c692470 100644
--- a/target-xtensa/cpu.h
+++ b/target-xtensa/cpu.h
@@ -383,7 +383,6 @@ typedef struct CPUXtensaState {
 #include "cpu-qom.h"
 
 #define cpu_exec cpu_xtensa_exec
-#define cpu_gen_code cpu_xtensa_gen_code
 #define cpu_signal_handler cpu_xtensa_signal_handler
 #define cpu_list xtensa_cpu_list
 
-- 
2.4.3


* [Qemu-devel] [PATCH v3 18/25] tcg: Add TCG_MAX_INSNS
  2015-09-22 20:24 [Qemu-devel] [PATCH v3 00/25] Do away with TB retranslation Richard Henderson
                   ` (16 preceding siblings ...)
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 17/25] target-*: Drop cpu_gen_code define Richard Henderson
@ 2015-09-22 20:25 ` Richard Henderson
  2015-09-24 20:02   ` Aurelien Jarno
  2015-09-22 20:25 ` [Qemu-devel] [PATCH v3 19/25] tcg: Pass data argument to restore_state_to_opc Richard Henderson
                   ` (6 subsequent siblings)
  24 siblings, 1 reply; 53+ messages in thread
From: Richard Henderson @ 2015-09-22 20:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, alex.bennee, aurelien

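Add TCG_MAX_INSNS, an upper bound of 512 guest instructions per
translation block, and adjust all translators to respect it.
Translators that did not previously track max_insns (moxie, tricore)
now compute it and stop translating once it is reached.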
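
For illustration only, the clamping that each translator gains can be
sketched as a standalone helper.  The name clamp_max_insns is
hypothetical (the patch open-codes the checks in every
gen_intermediate_code_internal), and the CF_COUNT_MASK value below is
an assumption made so that the sketch compiles on its own.

```c
#include <assert.h>

/* Assumed value for this sketch; the real definition lives in QEMU's headers. */
#define CF_COUNT_MASK 0x7fff
/* Value introduced by this patch in tcg/tcg.h. */
#define TCG_MAX_INSNS 512

/* Hypothetical helper: derive the per-TB instruction budget from cflags. */
static int clamp_max_insns(unsigned int cflags)
{
    int max_insns = (int)(cflags & CF_COUNT_MASK);

    /* A zero count means "no explicit limit requested". */
    if (max_insns == 0) {
        max_insns = CF_COUNT_MASK;
    }
    /* Never translate more guest insns than TCG is prepared to track. */
    if (max_insns > TCG_MAX_INSNS) {
        max_insns = TCG_MAX_INSNS;
    }

    assert(max_insns >= 1 && max_insns <= TCG_MAX_INSNS);
    return max_insns;
}
```

With these assumptions, a cflags count of 0 yields 512, a count of 1
stays 1, and any count above 512 is reduced to 512.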

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target-alpha/translate.c      |  3 +++
 target-arm/translate-a64.c    |  3 +++
 target-arm/translate.c        |  6 +++++-
 target-cris/translate.c       |  3 +++
 target-i386/translate.c       |  6 +++++-
 target-lm32/translate.c       |  3 +++
 target-m68k/translate.c       |  6 +++++-
 target-microblaze/translate.c |  6 +++++-
 target-mips/translate.c       |  7 ++++++-
 target-moxie/translate.c      | 13 +++++++++++--
 target-openrisc/translate.c   |  3 +++
 target-ppc/translate.c        |  6 +++++-
 target-s390x/translate.c      |  3 +++
 target-sh4/translate.c        |  7 ++++++-
 target-sparc/translate.c      |  7 ++++++-
 target-tilegx/translate.c     |  3 +++
 target-tricore/translate.c    | 20 +++++++++++++-------
 target-unicore32/translate.c  |  3 +++
 target-xtensa/translate.c     |  3 +++
 tcg/tcg.h                     |  1 +
 20 files changed, 95 insertions(+), 17 deletions(-)

diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index c10193e..538e202 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -2903,6 +2903,9 @@ static inline void gen_intermediate_code_internal(AlphaCPU *cpu,
     if (max_insns == 0) {
         max_insns = CF_COUNT_MASK;
     }
+    if (max_insns > TCG_MAX_INSNS) {
+        max_insns = TCG_MAX_INSNS;
+    }
 
     if (in_superpage(&ctx, pc_start)) {
         pc_mask = (1ULL << 41) - 1;
diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index 654a586..5022fc3 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -11072,6 +11072,9 @@ void gen_intermediate_code_internal_a64(ARMCPU *cpu,
     if (max_insns == 0) {
         max_insns = CF_COUNT_MASK;
     }
+    if (max_insns > TCG_MAX_INSNS) {
+        max_insns = TCG_MAX_INSNS;
+    }
 
     gen_tb_start(tb);
 
diff --git a/target-arm/translate.c b/target-arm/translate.c
index fb69ecb..fedb781 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -11258,8 +11258,12 @@ static inline void gen_intermediate_code_internal(ARMCPU *cpu,
     lj = -1;
     num_insns = 0;
     max_insns = tb->cflags & CF_COUNT_MASK;
-    if (max_insns == 0)
+    if (max_insns == 0) {
         max_insns = CF_COUNT_MASK;
+    }
+    if (max_insns > TCG_MAX_INSNS) {
+        max_insns = TCG_MAX_INSNS;
+    }
 
     gen_tb_start(tb);
 
diff --git a/target-cris/translate.c b/target-cris/translate.c
index 3d55a6a..d038bdb 100644
--- a/target-cris/translate.c
+++ b/target-cris/translate.c
@@ -3155,6 +3155,9 @@ gen_intermediate_code_internal(CRISCPU *cpu, TranslationBlock *tb,
     if (max_insns == 0) {
         max_insns = CF_COUNT_MASK;
     }
+    if (max_insns > TCG_MAX_INSNS) {
+        max_insns = TCG_MAX_INSNS;
+    }
 
     gen_tb_start(tb);
     do {
diff --git a/target-i386/translate.c b/target-i386/translate.c
index 7501b91..d3282e8 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -7932,8 +7932,12 @@ static inline void gen_intermediate_code_internal(X86CPU *cpu,
     lj = -1;
     num_insns = 0;
     max_insns = tb->cflags & CF_COUNT_MASK;
-    if (max_insns == 0)
+    if (max_insns == 0) {
         max_insns = CF_COUNT_MASK;
+    }
+    if (max_insns > TCG_MAX_INSNS) {
+        max_insns = TCG_MAX_INSNS;
+    }
 
     gen_tb_start(tb);
     for(;;) {
diff --git a/target-lm32/translate.c b/target-lm32/translate.c
index 8ea7929..e16c31a 100644
--- a/target-lm32/translate.c
+++ b/target-lm32/translate.c
@@ -1069,6 +1069,9 @@ void gen_intermediate_code_internal(LM32CPU *cpu,
     if (max_insns == 0) {
         max_insns = CF_COUNT_MASK;
     }
+    if (max_insns > TCG_MAX_INSNS) {
+        max_insns = TCG_MAX_INSNS;
+    }
 
     gen_tb_start(tb);
     do {
diff --git a/target-m68k/translate.c b/target-m68k/translate.c
index afef37f..185c565 100644
--- a/target-m68k/translate.c
+++ b/target-m68k/translate.c
@@ -2991,8 +2991,12 @@ gen_intermediate_code_internal(M68kCPU *cpu, TranslationBlock *tb,
     lj = -1;
     num_insns = 0;
     max_insns = tb->cflags & CF_COUNT_MASK;
-    if (max_insns == 0)
+    if (max_insns == 0) {
         max_insns = CF_COUNT_MASK;
+    }
+    if (max_insns > TCG_MAX_INSNS) {
+        max_insns = TCG_MAX_INSNS;
+    }
 
     gen_tb_start(tb);
     do {
diff --git a/target-microblaze/translate.c b/target-microblaze/translate.c
index 1224456..58b27ca 100644
--- a/target-microblaze/translate.c
+++ b/target-microblaze/translate.c
@@ -1674,8 +1674,12 @@ gen_intermediate_code_internal(MicroBlazeCPU *cpu, TranslationBlock *tb,
     lj = -1;
     num_insns = 0;
     max_insns = tb->cflags & CF_COUNT_MASK;
-    if (max_insns == 0)
+    if (max_insns == 0) {
         max_insns = CF_COUNT_MASK;
+    }
+    if (max_insns > TCG_MAX_INSNS) {
+        max_insns = TCG_MAX_INSNS;
+    }
 
     gen_tb_start(tb);
     do
diff --git a/target-mips/translate.c b/target-mips/translate.c
index 30d7d46..c0a0674 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -19586,8 +19586,13 @@ gen_intermediate_code_internal(MIPSCPU *cpu, TranslationBlock *tb,
                                  MO_UNALN : MO_ALIGN;
     num_insns = 0;
     max_insns = tb->cflags & CF_COUNT_MASK;
-    if (max_insns == 0)
+    if (max_insns == 0) {
         max_insns = CF_COUNT_MASK;
+    }
+    if (max_insns > TCG_MAX_INSNS) {
+        max_insns = TCG_MAX_INSNS;
+    }
+
     LOG_DISAS("\ntb %p idx %d hflags %04x\n", tb, ctx.mem_idx, ctx.hflags);
     gen_tb_start(tb);
     while (ctx.bstate == BS_NONE) {
diff --git a/target-moxie/translate.c b/target-moxie/translate.c
index d71f55b..68588da 100644
--- a/target-moxie/translate.c
+++ b/target-moxie/translate.c
@@ -824,7 +824,7 @@ gen_intermediate_code_internal(MoxieCPU *cpu, TranslationBlock *tb,
     target_ulong pc_start;
     int j, lj = -1;
     CPUMoxieState *env = &cpu->env;
-    int num_insns;
+    int num_insns, max_insns;
 
     pc_start = tb->pc;
     ctx.pc = pc_start;
@@ -834,6 +834,13 @@ gen_intermediate_code_internal(MoxieCPU *cpu, TranslationBlock *tb,
     ctx.singlestep_enabled = 0;
     ctx.bstate = BS_NONE;
     num_insns = 0;
+    max_insns = tb->cflags & CF_COUNT_MASK;
+    if (max_insns == 0) {
+        max_insns = CF_COUNT_MASK;
+    }
+    if (max_insns > TCG_MAX_INSNS) {
+        max_insns = TCG_MAX_INSNS;
+    }
 
     gen_tb_start(tb);
     do {
@@ -862,10 +869,12 @@ gen_intermediate_code_internal(MoxieCPU *cpu, TranslationBlock *tb,
         ctx.opcode = cpu_lduw_code(env, ctx.pc);
         ctx.pc += decode_opc(cpu, &ctx);
 
+        if (num_insns >= max_insns) {
+            break;
+        }
         if (cs->singlestep_enabled) {
             break;
         }
-
         if ((ctx.pc & (TARGET_PAGE_SIZE - 1)) == 0) {
             break;
         }
diff --git a/target-openrisc/translate.c b/target-openrisc/translate.c
index 9755850..7573d34 100644
--- a/target-openrisc/translate.c
+++ b/target-openrisc/translate.c
@@ -1654,6 +1654,9 @@ static inline void gen_intermediate_code_internal(OpenRISCCPU *cpu,
     if (max_insns == 0) {
         max_insns = CF_COUNT_MASK;
     }
+    if (max_insns > TCG_MAX_INSNS) {
+        max_insns = TCG_MAX_INSNS;
+    }
 
     gen_tb_start(tb);
 
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index fc234a3..2dc6fb4 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -11475,8 +11475,12 @@ static inline void gen_intermediate_code_internal(PowerPCCPU *cpu,
 #endif
     num_insns = 0;
     max_insns = tb->cflags & CF_COUNT_MASK;
-    if (max_insns == 0)
+    if (max_insns == 0) {
         max_insns = CF_COUNT_MASK;
+    }
+    if (max_insns > TCG_MAX_INSNS) {
+        max_insns = TCG_MAX_INSNS;
+    }
 
     gen_tb_start(tb);
     tcg_clear_temp_count();
diff --git a/target-s390x/translate.c b/target-s390x/translate.c
index 6bbc760..b1aa139 100644
--- a/target-s390x/translate.c
+++ b/target-s390x/translate.c
@@ -5352,6 +5352,9 @@ static inline void gen_intermediate_code_internal(S390CPU *cpu,
     if (max_insns == 0) {
         max_insns = CF_COUNT_MASK;
     }
+    if (max_insns > TCG_MAX_INSNS) {
+        max_insns = TCG_MAX_INSNS;
+    }
 
     gen_tb_start(tb);
 
diff --git a/target-sh4/translate.c b/target-sh4/translate.c
index efaa6f6..b48b9bb 100644
--- a/target-sh4/translate.c
+++ b/target-sh4/translate.c
@@ -1844,8 +1844,13 @@ gen_intermediate_code_internal(SuperHCPU *cpu, TranslationBlock *tb,
     ii = -1;
     num_insns = 0;
     max_insns = tb->cflags & CF_COUNT_MASK;
-    if (max_insns == 0)
+    if (max_insns == 0) {
         max_insns = CF_COUNT_MASK;
+    }
+    if (max_insns > TCG_MAX_INSNS) {
+        max_insns = TCG_MAX_INSNS;
+    }
+
     gen_tb_start(tb);
     while (ctx.bstate == BS_NONE && !tcg_op_buf_full()) {
         if (search_pc) {
diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index 6e5b82d..e6ecd21 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -5236,8 +5236,13 @@ static inline void gen_intermediate_code_internal(SPARCCPU *cpu,
 
     num_insns = 0;
     max_insns = tb->cflags & CF_COUNT_MASK;
-    if (max_insns == 0)
+    if (max_insns == 0) {
         max_insns = CF_COUNT_MASK;
+    }
+    if (max_insns > TCG_MAX_INSNS) {
+        max_insns = TCG_MAX_INSNS;
+    }
+
     gen_tb_start(tb);
     do {
         if (spc) {
diff --git a/target-tilegx/translate.c b/target-tilegx/translate.c
index c23b761..a6b9cd8 100644
--- a/target-tilegx/translate.c
+++ b/target-tilegx/translate.c
@@ -2081,6 +2081,9 @@ static inline void gen_intermediate_code_internal(TileGXCPU *cpu,
     if (cs->singlestep_enabled || singlestep) {
         max_insns = 1;
     }
+    if (max_insns > TCG_MAX_INSNS) {
+        max_insns = TCG_MAX_INSNS;
+    }
     gen_tb_start(tb);
 
     while (1) {
diff --git a/target-tricore/translate.c b/target-tricore/translate.c
index fa10d5c..5345486 100644
--- a/target-tricore/translate.c
+++ b/target-tricore/translate.c
@@ -8274,13 +8274,24 @@ gen_intermediate_code_internal(TriCoreCPU *cpu, struct TranslationBlock *tb,
     CPUTriCoreState *env = &cpu->env;
     DisasContext ctx;
     target_ulong pc_start;
-    int num_insns;
+    int num_insns, max_insns;
 
     if (search_pc) {
         qemu_log("search pc %d\n", search_pc);
     }
 
     num_insns = 0;
+    max_insns = tb->cflags & CF_COUNT_MASK;
+    if (max_insns == 0) {
+        max_insns = CF_COUNT_MASK;
+    }
+    if (singlestep) {
+        max_insns = 1;
+    }
+    if (max_insns > TCG_MAX_INSNS) {
+        max_insns = TCG_MAX_INSNS;
+    }
+
     pc_start = tb->pc;
     ctx.pc = pc_start;
     ctx.saved_pc = -1;
@@ -8298,12 +8309,7 @@ gen_intermediate_code_internal(TriCoreCPU *cpu, struct TranslationBlock *tb,
         ctx.opcode = cpu_ldl_code(env, ctx.pc);
         decode_opc(env, &ctx, 0);
 
-        if (tcg_op_buf_full()) {
-            gen_save_pc(ctx.next_pc);
-            tcg_gen_exit_tb(0);
-            break;
-        }
-        if (singlestep) {
+        if (num_insns >= max_insns || tcg_op_buf_full()) {
             gen_save_pc(ctx.next_pc);
             tcg_gen_exit_tb(0);
             break;
diff --git a/target-unicore32/translate.c b/target-unicore32/translate.c
index cd23c4b..5e61e38 100644
--- a/target-unicore32/translate.c
+++ b/target-unicore32/translate.c
@@ -1900,6 +1900,9 @@ static inline void gen_intermediate_code_internal(UniCore32CPU *cpu,
     if (max_insns == 0) {
         max_insns = CF_COUNT_MASK;
     }
+    if (max_insns > TCG_MAX_INSNS) {
+        max_insns = TCG_MAX_INSNS;
+    }
 
 #ifndef CONFIG_USER_ONLY
     if ((env->uncached_asr & ASR_M) == ASR_MODE_USER) {
diff --git a/target-xtensa/translate.c b/target-xtensa/translate.c
index ea87cb5..4d4dc06 100644
--- a/target-xtensa/translate.c
+++ b/target-xtensa/translate.c
@@ -3014,6 +3014,9 @@ void gen_intermediate_code_internal(XtensaCPU *cpu,
     if (max_insns == 0) {
         max_insns = CF_COUNT_MASK;
     }
+    if (max_insns > TCG_MAX_INSNS) {
+        max_insns = TCG_MAX_INSNS;
+    }
 
     dc.config = env->config;
     dc.singlestep_enabled = cs->singlestep_enabled;
diff --git a/tcg/tcg.h b/tcg/tcg.h
index c975076..151e17d 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -194,6 +194,7 @@ typedef struct TCGPool {
 #define TCG_POOL_CHUNK_SIZE 32768
 
 #define TCG_MAX_TEMPS 512
+#define TCG_MAX_INSNS 512
 
 /* when the size of the arguments of a called function is smaller than
    this value, they are statically allocated in the TB stack frame */
-- 
2.4.3


* [Qemu-devel] [PATCH v3 19/25] tcg: Pass data argument to restore_state_to_opc
  2015-09-22 20:24 [Qemu-devel] [PATCH v3 00/25] Do away with TB retranslation Richard Henderson
                   ` (17 preceding siblings ...)
  2015-09-22 20:25 ` [Qemu-devel] [PATCH v3 18/25] tcg: Add TCG_MAX_INSNS Richard Henderson
@ 2015-09-22 20:25 ` Richard Henderson
  2015-09-24 20:11   ` Aurelien Jarno
  2015-09-22 20:25 ` [Qemu-devel] [PATCH v3 20/25] tcg: Save insn data and use it in cpu_restore_state_from_tb Richard Henderson
                   ` (5 subsequent siblings)
  24 siblings, 1 reply; 53+ messages in thread
From: Richard Henderson @ 2015-09-22 20:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, alex.bennee, aurelien

The gen_opc_* arrays are already redundant with the data stored in
the insn_start arguments.  Transition restore_state_to_opc to use
data from the latter.
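
As a rough sketch of the data flow, assuming a 64-bit guest on a 32-bit
host (the case where each insn_start word is split across two host
arguments): the words are reassembled low half first, and the
per-target hook then reads them by index.  All names below (host_arg,
toy_env, collect_insn_data, toy_restore_state, INSN_START_WORDS) are
hypothetical stand-ins, not QEMU's own definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for TARGET_INSN_START_WORDS. */
#define INSN_START_WORDS 2

typedef uint64_t target_ulong;  /* assumption: 64-bit guest */
typedef uint32_t host_arg;      /* assumption: 32-bit host argument */

/* Toy CPU state: a program counter plus one extra word of state. */
struct toy_env {
    target_ulong pc;
    target_ulong flags;
};

/* Rebuild each guest word from a (low, high) pair of host arguments. */
static void collect_insn_data(const host_arg *args, target_ulong *data)
{
    assert(args != NULL && data != NULL);
    for (int i = 0; i < INSN_START_WORDS; ++i) {
        data[i] = ((target_ulong)args[i * 2 + 1] << 32) | args[i * 2];
    }
}

/* Restore state from the collected words instead of indexing by pc_pos. */
static void toy_restore_state(struct toy_env *env, const target_ulong *data)
{
    env->pc = data[0];
    env->flags = data[1];
}
```

When the guest word fits in a host register, the patch instead copies
args[i] directly; the sketch covers only the split case.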

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 include/exec/exec-all.h       |  2 +-
 target-alpha/translate.c      |  5 +++--
 target-arm/translate.c        |  9 +++++----
 target-cris/translate.c       |  5 +++--
 target-i386/translate.c       | 26 ++++++--------------------
 target-lm32/translate.c       |  5 +++--
 target-m68k/translate.c       |  5 +++--
 target-microblaze/translate.c |  5 +++--
 target-mips/translate.c       |  9 +++++----
 target-moxie/translate.c      |  5 +++--
 target-openrisc/translate.c   |  4 ++--
 target-ppc/translate.c        |  5 +++--
 target-s390x/translate.c      |  8 ++++----
 target-sh4/translate.c        |  7 ++++---
 target-sparc/translate.c      | 10 ++++++----
 target-tilegx/translate.c     |  5 +++--
 target-tricore/translate.c    |  5 +++--
 target-unicore32/translate.c  |  5 +++--
 target-xtensa/translate.c     |  5 +++--
 tcg/tcg.c                     | 11 ++++++++++-
 tcg/tcg.h                     |  2 ++
 translate-all.c               |  2 +-
 22 files changed, 79 insertions(+), 66 deletions(-)

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 5340745..6a69802 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -75,7 +75,7 @@ typedef struct TranslationBlock TranslationBlock;
 void gen_intermediate_code(CPUArchState *env, struct TranslationBlock *tb);
 void gen_intermediate_code_pc(CPUArchState *env, struct TranslationBlock *tb);
 void restore_state_to_opc(CPUArchState *env, struct TranslationBlock *tb,
-                          int pc_pos);
+                          target_ulong *data);
 
 void cpu_gen_init(void);
 bool cpu_restore_state(CPUState *cpu, uintptr_t searched_pc);
diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index 538e202..8395a30 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -3023,7 +3023,8 @@ void gen_intermediate_code_pc (CPUAlphaState *env, struct TranslationBlock *tb)
     gen_intermediate_code_internal(alpha_env_get_cpu(env), tb, true);
 }
 
-void restore_state_to_opc(CPUAlphaState *env, TranslationBlock *tb, int pc_pos)
+void restore_state_to_opc(CPUAlphaState *env, TranslationBlock *tb,
+                          target_ulong *data)
 {
-    env->pc = tcg_ctx.gen_opc_pc[pc_pos];
+    env->pc = data[0];
 }
diff --git a/target-arm/translate.c b/target-arm/translate.c
index fedb781..2296953 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -11612,13 +11612,14 @@ void arm_cpu_dump_state(CPUState *cs, FILE *f, fprintf_function cpu_fprintf,
     }
 }
 
-void restore_state_to_opc(CPUARMState *env, TranslationBlock *tb, int pc_pos)
+void restore_state_to_opc(CPUARMState *env, TranslationBlock *tb,
+                          target_ulong *data)
 {
     if (is_a64(env)) {
-        env->pc = tcg_ctx.gen_opc_pc[pc_pos];
+        env->pc = data[0];
         env->condexec_bits = 0;
     } else {
-        env->regs[15] = tcg_ctx.gen_opc_pc[pc_pos];
-        env->condexec_bits = gen_opc_condexec_bits[pc_pos];
+        env->regs[15] = data[0];
+        env->condexec_bits = data[1];
     }
 }
diff --git a/target-cris/translate.c b/target-cris/translate.c
index d038bdb..77e2794 100644
--- a/target-cris/translate.c
+++ b/target-cris/translate.c
@@ -3433,7 +3433,8 @@ void cris_initialize_tcg(void)
     }
 }
 
-void restore_state_to_opc(CPUCRISState *env, TranslationBlock *tb, int pc_pos)
+void restore_state_to_opc(CPUCRISState *env, TranslationBlock *tb,
+                          target_ulong *data)
 {
-    env->pc = tcg_ctx.gen_opc_pc[pc_pos];
+    env->pc = data[0];
 }
diff --git a/target-i386/translate.c b/target-i386/translate.c
index d3282e8..2f7b77f 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -8055,26 +8055,12 @@ void gen_intermediate_code_pc(CPUX86State *env, TranslationBlock *tb)
     gen_intermediate_code_internal(x86_env_get_cpu(env), tb, true);
 }
 
-void restore_state_to_opc(CPUX86State *env, TranslationBlock *tb, int pc_pos)
+void restore_state_to_opc(CPUX86State *env, TranslationBlock *tb,
+                          target_ulong *data)
 {
-    int cc_op;
-#ifdef DEBUG_DISAS
-    if (qemu_loglevel_mask(CPU_LOG_TB_OP)) {
-        int i;
-        qemu_log("RESTORE:\n");
-        for(i = 0;i <= pc_pos; i++) {
-            if (tcg_ctx.gen_opc_instr_start[i]) {
-                qemu_log("0x%04x: " TARGET_FMT_lx "\n", i,
-                        tcg_ctx.gen_opc_pc[i]);
-            }
-        }
-        qemu_log("pc_pos=0x%x eip=" TARGET_FMT_lx " cs_base=%x\n",
-                pc_pos, tcg_ctx.gen_opc_pc[pc_pos] - tb->cs_base,
-                (uint32_t)tb->cs_base);
-    }
-#endif
-    env->eip = tcg_ctx.gen_opc_pc[pc_pos] - tb->cs_base;
-    cc_op = gen_opc_cc_op[pc_pos];
-    if (cc_op != CC_OP_DYNAMIC)
+    int cc_op = data[1];
+    env->eip = data[0] - tb->cs_base;
+    if (cc_op != CC_OP_DYNAMIC) {
         env->cc_op = cc_op;
+    }
 }
diff --git a/target-lm32/translate.c b/target-lm32/translate.c
index e16c31a..3379d2c 100644
--- a/target-lm32/translate.c
+++ b/target-lm32/translate.c
@@ -1207,9 +1207,10 @@ void lm32_cpu_dump_state(CPUState *cs, FILE *f, fprintf_function cpu_fprintf,
     cpu_fprintf(f, "\n\n");
 }
 
-void restore_state_to_opc(CPULM32State *env, TranslationBlock *tb, int pc_pos)
+void restore_state_to_opc(CPULM32State *env, TranslationBlock *tb,
+                          target_ulong *data)
 {
-    env->pc = tcg_ctx.gen_opc_pc[pc_pos];
+    env->pc = data[0];
 }
 
 void lm32_translate_init(void)
diff --git a/target-m68k/translate.c b/target-m68k/translate.c
index 185c565..ce8150e 100644
--- a/target-m68k/translate.c
+++ b/target-m68k/translate.c
@@ -3118,7 +3118,8 @@ void m68k_cpu_dump_state(CPUState *cs, FILE *f, fprintf_function cpu_fprintf,
     cpu_fprintf (f, "FPRESULT = %12g\n", *(double *)&env->fp_result);
 }
 
-void restore_state_to_opc(CPUM68KState *env, TranslationBlock *tb, int pc_pos)
+void restore_state_to_opc(CPUM68KState *env, TranslationBlock *tb,
+                          target_ulong *data)
 {
-    env->pc = tcg_ctx.gen_opc_pc[pc_pos];
+    env->pc = data[0];
 }
diff --git a/target-microblaze/translate.c b/target-microblaze/translate.c
index 58b27ca..973c744 100644
--- a/target-microblaze/translate.c
+++ b/target-microblaze/translate.c
@@ -1928,7 +1928,8 @@ void mb_tcg_init(void)
     }
 }
 
-void restore_state_to_opc(CPUMBState *env, TranslationBlock *tb, int pc_pos)
+void restore_state_to_opc(CPUMBState *env, TranslationBlock *tb,
+                          target_ulong *data)
 {
-    env->sregs[SR_PC] = tcg_ctx.gen_opc_pc[pc_pos];
+    env->sregs[SR_PC] = data[0];
 }
diff --git a/target-mips/translate.c b/target-mips/translate.c
index c0a0674..56f00e6 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -20061,18 +20061,19 @@ void cpu_state_reset(CPUMIPSState *env)
     }
 }
 
-void restore_state_to_opc(CPUMIPSState *env, TranslationBlock *tb, int pc_pos)
+void restore_state_to_opc(CPUMIPSState *env, TranslationBlock *tb,
+                          target_ulong *data)
 {
-    env->active_tc.PC = tcg_ctx.gen_opc_pc[pc_pos];
+    env->active_tc.PC = data[0];
     env->hflags &= ~MIPS_HFLAG_BMASK;
-    env->hflags |= gen_opc_hflags[pc_pos];
+    env->hflags |= data[1];
     switch (env->hflags & MIPS_HFLAG_BMASK_BASE) {
     case MIPS_HFLAG_BR:
         break;
     case MIPS_HFLAG_BC:
     case MIPS_HFLAG_BL:
     case MIPS_HFLAG_B:
-        env->btarget = gen_opc_btarget[pc_pos];
+        env->btarget = data[2];
         break;
     }
 }
diff --git a/target-moxie/translate.c b/target-moxie/translate.c
index 68588da..c007764 100644
--- a/target-moxie/translate.c
+++ b/target-moxie/translate.c
@@ -922,7 +922,8 @@ void gen_intermediate_code_pc(CPUMoxieState *env, struct TranslationBlock *tb)
     gen_intermediate_code_internal(moxie_env_get_cpu(env), tb, true);
 }
 
-void restore_state_to_opc(CPUMoxieState *env, TranslationBlock *tb, int pc_pos)
+void restore_state_to_opc(CPUMoxieState *env, TranslationBlock *tb,
+                          target_ulong *data)
 {
-    env->pc = tcg_ctx.gen_opc_pc[pc_pos];
+    env->pc = data[0];
 }
diff --git a/target-openrisc/translate.c b/target-openrisc/translate.c
index 7573d34..26bf87f 100644
--- a/target-openrisc/translate.c
+++ b/target-openrisc/translate.c
@@ -1794,7 +1794,7 @@ void openrisc_cpu_dump_state(CPUState *cs, FILE *f,
 }
 
 void restore_state_to_opc(CPUOpenRISCState *env, TranslationBlock *tb,
-                          int pc_pos)
+                          target_ulong *data)
 {
-    env->pc = tcg_ctx.gen_opc_pc[pc_pos];
+    env->pc = data[0];
 }
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 2dc6fb4..5c2dc38 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -11629,7 +11629,8 @@ void gen_intermediate_code_pc (CPUPPCState *env, struct TranslationBlock *tb)
     gen_intermediate_code_internal(ppc_env_get_cpu(env), tb, true);
 }
 
-void restore_state_to_opc(CPUPPCState *env, TranslationBlock *tb, int pc_pos)
+void restore_state_to_opc(CPUPPCState *env, TranslationBlock *tb,
+                          target_ulong *data)
 {
-    env->nip = tcg_ctx.gen_opc_pc[pc_pos];
+    env->nip = data[0];
 }
diff --git a/target-s390x/translate.c b/target-s390x/translate.c
index b1aa139..104265d 100644
--- a/target-s390x/translate.c
+++ b/target-s390x/translate.c
@@ -5460,11 +5460,11 @@ void gen_intermediate_code_pc (CPUS390XState *env, struct TranslationBlock *tb)
     gen_intermediate_code_internal(s390_env_get_cpu(env), tb, true);
 }
 
-void restore_state_to_opc(CPUS390XState *env, TranslationBlock *tb, int pc_pos)
+void restore_state_to_opc(CPUS390XState *env, TranslationBlock *tb,
+                          target_ulong *data)
 {
-    int cc_op;
-    env->psw.addr = tcg_ctx.gen_opc_pc[pc_pos];
-    cc_op = gen_opc_cc_op[pc_pos];
+    int cc_op = data[1];
+    env->psw.addr = data[0];
     if ((cc_op != CC_OP_DYNAMIC) && (cc_op != CC_OP_STATIC)) {
         env->cc_op = cc_op;
     }
diff --git a/target-sh4/translate.c b/target-sh4/translate.c
index b48b9bb..d3fe1de 100644
--- a/target-sh4/translate.c
+++ b/target-sh4/translate.c
@@ -1950,8 +1950,9 @@ void gen_intermediate_code_pc(CPUSH4State * env, struct TranslationBlock *tb)
     gen_intermediate_code_internal(sh_env_get_cpu(env), tb, true);
 }
 
-void restore_state_to_opc(CPUSH4State *env, TranslationBlock *tb, int pc_pos)
+void restore_state_to_opc(CPUSH4State *env, TranslationBlock *tb,
+                          target_ulong *data)
 {
-    env->pc = tcg_ctx.gen_opc_pc[pc_pos];
-    env->flags = gen_opc_hflags[pc_pos];
+    env->pc = data[0];
+    env->flags = data[1];
 }
diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index e6ecd21..18344c8 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -5459,11 +5459,13 @@ void gen_intermediate_code_init(CPUSPARCState *env)
     }
 }
 
-void restore_state_to_opc(CPUSPARCState *env, TranslationBlock *tb, int pc_pos)
+void restore_state_to_opc(CPUSPARCState *env, TranslationBlock *tb,
+                          target_ulong *data)
 {
-    target_ulong pc, npc;
-    env->pc = pc = tcg_ctx.gen_opc_pc[pc_pos];
-    npc = gen_opc_npc[pc_pos];
+    target_ulong pc = data[0];
+    target_ulong npc = data[1];
+
+    env->pc = pc;
     if (npc == DYNAMIC_PC) {
         /* dynamic NPC: already stored */
     } else if (npc & JUMP_PC) {
diff --git a/target-tilegx/translate.c b/target-tilegx/translate.c
index a6b9cd8..eae5622 100644
--- a/target-tilegx/translate.c
+++ b/target-tilegx/translate.c
@@ -2144,9 +2144,10 @@ void gen_intermediate_code_pc(CPUTLGState *env, struct TranslationBlock *tb)
     gen_intermediate_code_internal(tilegx_env_get_cpu(env), tb, true);
 }
 
-void restore_state_to_opc(CPUTLGState *env, TranslationBlock *tb, int pc_pos)
+void restore_state_to_opc(CPUTLGState *env, TranslationBlock *tb,
+                          target_ulong *data)
 {
-    env->pc = tcg_ctx.gen_opc_pc[pc_pos];
+    env->pc = data[0];
 }
 
 void tilegx_tcg_init(void)
diff --git a/target-tricore/translate.c b/target-tricore/translate.c
index 5345486..6f5438f 100644
--- a/target-tricore/translate.c
+++ b/target-tricore/translate.c
@@ -8350,9 +8350,10 @@ gen_intermediate_code_pc(CPUTriCoreState *env, struct TranslationBlock *tb)
 }
 
 void
-restore_state_to_opc(CPUTriCoreState *env, TranslationBlock *tb, int pc_pos)
+restore_state_to_opc(CPUTriCoreState *env, TranslationBlock *tb,
+                     target_ulong *data)
 {
-    env->PC = tcg_ctx.gen_opc_pc[pc_pos];
+    env->PC = data[0];
 }
 /*
  *
diff --git a/target-unicore32/translate.c b/target-unicore32/translate.c
index 5e61e38..9d8167a 100644
--- a/target-unicore32/translate.c
+++ b/target-unicore32/translate.c
@@ -2129,7 +2129,8 @@ void uc32_cpu_dump_state(CPUState *cs, FILE *f,
     cpu_dump_state_ucf64(env, f, cpu_fprintf, flags);
 }
 
-void restore_state_to_opc(CPUUniCore32State *env, TranslationBlock *tb, int pc_pos)
+void restore_state_to_opc(CPUUniCore32State *env, TranslationBlock *tb,
+                          target_ulong *data)
 {
-    env->regs[31] = tcg_ctx.gen_opc_pc[pc_pos];
+    env->regs[31] = data[0];
 }
diff --git a/target-xtensa/translate.c b/target-xtensa/translate.c
index 4d4dc06..ac967e1 100644
--- a/target-xtensa/translate.c
+++ b/target-xtensa/translate.c
@@ -3202,7 +3202,8 @@ void xtensa_cpu_dump_state(CPUState *cs, FILE *f,
     }
 }
 
-void restore_state_to_opc(CPUXtensaState *env, TranslationBlock *tb, int pc_pos)
+void restore_state_to_opc(CPUXtensaState *env, TranslationBlock *tb,
+                          target_ulong *data)
 {
-    env->pc = tcg_ctx.gen_opc_pc[pc_pos];
+    env->pc = data[0];
 }
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 3308d68..bdb83d9 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -2294,7 +2294,7 @@ static inline int tcg_gen_code_common(TCGContext *s,
                                       tcg_insn_unit *gen_code_buf,
                                       long search_pc)
 {
-    int oi, oi_next;
+    int i, oi, oi_next;
 
 #ifdef DEBUG_DISAS
     if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP))) {
@@ -2361,6 +2361,15 @@ static inline int tcg_gen_code_common(TCGContext *s,
             tcg_reg_alloc_movi(s, args, dead_args, sync_args);
             break;
         case INDEX_op_insn_start:
+            for (i = 0; i < TARGET_INSN_START_WORDS; ++i) {
+                target_ulong a;
+#if TARGET_LONG_BITS > TCG_TARGET_REG_BITS
+                a = ((target_ulong)args[i * 2 + 1] << 32) | args[i * 2];
+#else
+                a = args[i];
+#endif
+                s->gen_opc_data[i] = a;
+            }
             break;
         case INDEX_op_discard:
             temp_dead(s, args[0]);
diff --git a/tcg/tcg.h b/tcg/tcg.h
index 151e17d..8fd1252 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -580,6 +580,8 @@ struct TCGContext {
     target_ulong gen_opc_pc[OPC_BUF_SIZE];
     uint16_t gen_opc_icount[OPC_BUF_SIZE];
     uint8_t gen_opc_instr_start[OPC_BUF_SIZE];
+
+    target_ulong gen_opc_data[TARGET_INSN_START_WORDS];
 };
 
 extern TCGContext tcg_ctx;
diff --git a/translate-all.c b/translate-all.c
index 19c2988..9f801ae 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -218,7 +218,7 @@ static int cpu_restore_state_from_tb(CPUState *cpu, TranslationBlock *tb,
     }
     cpu->icount_decr.u16.low -= s->gen_opc_icount[j];
 
-    restore_state_to_opc(env, tb, j);
+    restore_state_to_opc(env, tb, s->gen_opc_data);
 
 #ifdef CONFIG_PROFILER
     s->restore_time += profile_getclock() - ti;
-- 
2.4.3

* [Qemu-devel] [PATCH v3 20/25] tcg: Save insn data and use it in cpu_restore_state_from_tb
  2015-09-22 20:24 [Qemu-devel] [PATCH v3 00/25] Do away with TB retranslation Richard Henderson
                   ` (18 preceding siblings ...)
  2015-09-22 20:25 ` [Qemu-devel] [PATCH v3 19/25] tcg: Pass data argument to restore_state_to_opc Richard Henderson
@ 2015-09-22 20:25 ` Richard Henderson
  2015-09-23 19:20   ` Peter Maydell
  2015-09-25 21:10   ` Aurelien Jarno
  2015-09-22 20:25 ` [Qemu-devel] [PATCH v3 21/25] tcg: Remove gen_intermediate_code_pc Richard Henderson
                   ` (4 subsequent siblings)
  24 siblings, 2 replies; 53+ messages in thread
From: Richard Henderson @ 2015-09-22 20:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, alex.bennee, aurelien

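Record the insn_start data for each guest insn, along with the end
offset of its host code, in a compressed table stored after the TB's
generated code.  cpu_restore_state_from_tb decodes this table, so we
can now restore state without retranslation.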

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
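The search data in this patch is stored as signed LEB128 values.  As a
reading aid, here is a minimal standalone sketch of that encoding.  It
uses int64_t in place of target_long, and the demo_* names are
hypothetical, not part of the patch.  The shift is written so that it
does not depend on the compiler's treatment of right-shifting negative
values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode VAL as signed LEB128 at P; return P advanced past the value. */
static uint8_t *demo_encode_sleb128(uint8_t *p, int64_t val)
{
    int more;

    do {
        uint8_t byte = val & 0x7f;
        /* Arithmetic shift right by 7, spelled portably. */
        val = val >= 0 ? val >> 7 : ~(~val >> 7);
        /* Stop once the remaining bits are pure sign extension of
           bit 6 of the byte just produced. */
        more = !((val == 0 && !(byte & 0x40))
                 || (val == -1 && (byte & 0x40)));
        if (more) {
            byte |= 0x80;
        }
        *p++ = byte;
    } while (more);

    return p;
}

/* Decode a signed LEB128 value at *PP; advance *PP past it. */
static int64_t demo_decode_sleb128(const uint8_t **pp)
{
    const uint8_t *p = *pp;
    uint64_t val = 0;
    unsigned shift = 0;
    uint8_t byte;

    do {
        byte = *p++;
        val |= (uint64_t)(byte & 0x7f) << shift;
        shift += 7;
    } while (byte & 0x80);
    if (shift < 64 && (byte & 0x40)) {
        /* Sign-extend from the last payload bit. */
        val |= ~(uint64_t)0 << shift;
    }

    *pp = p;
    return (int64_t)val;
}

/* Encode then decode VAL; report the encoded length through LEN. */
static int64_t demo_roundtrip(int64_t val, size_t *len)
{
    uint8_t buf[10];            /* 10 bytes suffice for any 64-bit value */
    const uint8_t *q = buf;
    uint8_t *end = demo_encode_sleb128(buf, val);
    int64_t out = demo_decode_sleb128(&q);

    assert(q == end);           /* decoder consumed exactly what was written */
    *len = (size_t)(end - buf);
    return out;
}
```

Values in the range [-64, 63] take a single byte, which is why small
deltas between consecutive insns keep the per-TB table compact.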
 include/exec/exec-all.h |   1 +
 tcg/tcg.c               |  40 ++++++++-----
 tcg/tcg.h               |   4 +-
 translate-all.c         | 149 +++++++++++++++++++++++++++++++++++-------------
 4 files changed, 139 insertions(+), 55 deletions(-)
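
The table layout used by encode_search and cpu_restore_state_from_tb
can be sketched in isolation as follows.  This is an illustration under
stated assumptions, not the patch itself: the demo_* names and DemoInsn
are hypothetical, two start words are assumed, and host positions are
kept as offsets from the start of the TB rather than absolute host pcs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_START_WORDS 2  /* hypothetical: guest pc plus one extra word */

typedef struct {
    uint64_t data[DEMO_START_WORDS]; /* words recorded at insn_start */
    uint16_t end_off;                /* host code offset at end of insn */
} DemoInsn;

/* Signed LEB128 writer and reader, as in the patch but on int64_t. */
static uint8_t *demo_put(uint8_t *p, int64_t val)
{
    int more;
    do {
        uint8_t byte = val & 0x7f;
        val = val >= 0 ? val >> 7 : ~(~val >> 7);
        more = !((val == 0 && !(byte & 0x40))
                 || (val == -1 && (byte & 0x40)));
        *p++ = byte | (more ? 0x80 : 0);
    } while (more);
    return p;
}

static int64_t demo_get(const uint8_t **pp)
{
    const uint8_t *p = *pp;
    uint64_t val = 0;
    unsigned shift = 0;
    uint8_t byte;
    do {
        byte = *p++;
        val |= (uint64_t)(byte & 0x7f) << shift;
        shift += 7;
    } while (byte & 0x80);
    if (shift < 64 && (byte & 0x40)) {
        val |= ~(uint64_t)0 << shift;
    }
    *pp = p;
    return (int64_t)val;
}

/* Encode N rows as deltas from the previous row.  The row before the
   first is seeded with { tb_pc, 0, ... } and end offset 0. */
static size_t demo_encode_search(uint8_t *block, uint64_t tb_pc,
                                 const DemoInsn *insn, int n)
{
    uint8_t *p = block;

    assert(block != NULL && n > 0);
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < DEMO_START_WORDS; ++j) {
            uint64_t prev = i ? insn[i - 1].data[j] : (j == 0 ? tb_pc : 0);
            p = demo_put(p, (int64_t)(insn[i].data[j] - prev));
        }
        p = demo_put(p, (int64_t)insn[i].end_off
                        - (i ? insn[i - 1].end_off : 0));
    }
    return (size_t)(p - block);
}

/* Walk the table until the end of an insn lies beyond OFF.  On success
   DATA holds that insn's start words and its index is returned;
   otherwise return -1. */
static int demo_search(const uint8_t *block, uint64_t tb_pc, int n,
                       uintptr_t off, uint64_t data[DEMO_START_WORDS])
{
    const uint8_t *p = block;
    uintptr_t end = 0;

    for (int j = 0; j < DEMO_START_WORDS; ++j) {
        data[j] = (j == 0 ? tb_pc : 0);
    }
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < DEMO_START_WORDS; ++j) {
            data[j] += (uint64_t)demo_get(&p);
        }
        end += (uintptr_t)demo_get(&p);
        if (end > off) {
            return i;
        }
    }
    return -1;
}
```

Because the decoder replays the same seed and the same deltas, no
absolute values need to be stored, and the lookup is a single linear
pass over a TB's worth of bytes.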

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 6a69802..402dd87 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -199,6 +199,7 @@ struct TranslationBlock {
 #define CF_USE_ICOUNT  0x20000
 
     void *tc_ptr;    /* pointer to the translated code */
+    uint8_t *tc_search;  /* pointer to search data */
     /* next matching tb for physical address. */
     struct TranslationBlock *phys_hash_next;
     /* original tb when cflags has CF_NOCACHE */
diff --git a/tcg/tcg.c b/tcg/tcg.c
index bdb83d9..a0fce5b 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -2294,7 +2294,7 @@ static inline int tcg_gen_code_common(TCGContext *s,
                                       tcg_insn_unit *gen_code_buf,
                                       long search_pc)
 {
-    int i, oi, oi_next;
+    int i, oi, oi_next, num_insns;
 
 #ifdef DEBUG_DISAS
     if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP))) {
@@ -2338,6 +2338,7 @@ static inline int tcg_gen_code_common(TCGContext *s,
 
     tcg_out_tb_init(s);
 
+    num_insns = -1;
     for (oi = s->gen_first_op_idx; oi >= 0; oi = oi_next) {
         TCGOp * const op = &s->gen_op_buf[oi];
         TCGArg * const args = &s->gen_opparam_buf[op->args];
@@ -2361,6 +2362,10 @@ static inline int tcg_gen_code_common(TCGContext *s,
             tcg_reg_alloc_movi(s, args, dead_args, sync_args);
             break;
         case INDEX_op_insn_start:
+            if (num_insns >= 0) {
+                s->gen_insn_end_off[num_insns] = tcg_current_code_size(s);
+            }
+            num_insns++;
             for (i = 0; i < TARGET_INSN_START_WORDS; ++i) {
                 target_ulong a;
 #if TARGET_LONG_BITS > TCG_TARGET_REG_BITS
@@ -2368,7 +2373,7 @@ static inline int tcg_gen_code_common(TCGContext *s,
 #else
                 a = args[i];
 #endif
-                s->gen_opc_data[i] = a;
+                s->gen_insn_data[num_insns][i] = a;
             }
             break;
         case INDEX_op_discard:
@@ -2400,6 +2405,8 @@ static inline int tcg_gen_code_common(TCGContext *s,
         check_regs(s);
 #endif
     }
+    tcg_debug_assert(num_insns >= 0);
+    s->gen_insn_end_off[num_insns] = tcg_current_code_size(s);
 
     /* Generate TB finalization at the end of block */
     tcg_out_tb_finalize(s);
@@ -2448,24 +2455,26 @@ int tcg_gen_code_search_pc(TCGContext *s, tcg_insn_unit *gen_code_buf,
 void tcg_dump_info(FILE *f, fprintf_function cpu_fprintf)
 {
     TCGContext *s = &tcg_ctx;
-    int64_t tot;
+    int64_t tb_count = s->tb_count;
+    int64_t tb_div_count = tb_count ? tb_count : 1;
+    int64_t tot = s->interm_time + s->code_time;
 
-    tot = s->interm_time + s->code_time;
     cpu_fprintf(f, "JIT cycles          %" PRId64 " (%0.3f s at 2.4 GHz)\n",
                 tot, tot / 2.4e9);
     cpu_fprintf(f, "translated TBs      %" PRId64 " (aborted=%" PRId64 " %0.1f%%)\n", 
-                s->tb_count, 
-                s->tb_count1 - s->tb_count,
-                s->tb_count1 ? (double)(s->tb_count1 - s->tb_count) / s->tb_count1 * 100.0 : 0);
+                tb_count, s->tb_count1 - tb_count,
+                (double)(s->tb_count1 - s->tb_count)
+                / (s->tb_count1 ? s->tb_count1 : 1) * 100.0);
     cpu_fprintf(f, "avg ops/TB          %0.1f max=%d\n", 
-                s->tb_count ? (double)s->op_count / s->tb_count : 0, s->op_count_max);
+                (double)s->op_count / tb_div_count, s->op_count_max);
     cpu_fprintf(f, "deleted ops/TB      %0.2f\n",
-                s->tb_count ? 
-                (double)s->del_op_count / s->tb_count : 0);
+                (double)s->del_op_count / tb_div_count);
     cpu_fprintf(f, "avg temps/TB        %0.2f max=%d\n",
-                s->tb_count ? 
-                (double)s->temp_count / s->tb_count : 0,
-                s->temp_count_max);
+                (double)s->temp_count / tb_div_count, s->temp_count_max);
+    cpu_fprintf(f, "avg host code/TB    %0.1f\n",
+                (double)s->code_out_len / tb_div_count);
+    cpu_fprintf(f, "avg search data/TB  %0.1f\n",
+                (double)s->search_out_len / tb_div_count);
     
     cpu_fprintf(f, "cycles/op           %0.1f\n", 
                 s->op_count ? (double)tot / s->op_count : 0);
@@ -2473,8 +2482,11 @@ void tcg_dump_info(FILE *f, fprintf_function cpu_fprintf)
                 s->code_in_len ? (double)tot / s->code_in_len : 0);
     cpu_fprintf(f, "cycles/out byte     %0.1f\n", 
                 s->code_out_len ? (double)tot / s->code_out_len : 0);
-    if (tot == 0)
+    cpu_fprintf(f, "cycles/search byte  %0.1f\n",
+                s->search_out_len ? (double)tot / s->search_out_len : 0);
+    if (tot == 0) {
         tot = 1;
+    }
     cpu_fprintf(f, "  gen_interm time   %0.1f%%\n", 
                 (double)s->interm_time / tot * 100.0);
     cpu_fprintf(f, "  gen_code time     %0.1f%%\n", 
diff --git a/tcg/tcg.h b/tcg/tcg.h
index 8fd1252..df499c6 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -532,6 +532,7 @@ struct TCGContext {
     int64_t del_op_count;
     int64_t code_in_len;
     int64_t code_out_len;
+    int64_t search_out_len;
     int64_t interm_time;
     int64_t code_time;
     int64_t la_time;
@@ -581,7 +582,8 @@ struct TCGContext {
     uint16_t gen_opc_icount[OPC_BUF_SIZE];
     uint8_t gen_opc_instr_start[OPC_BUF_SIZE];
 
-    target_ulong gen_opc_data[TARGET_INSN_START_WORDS];
+    uint16_t gen_insn_end_off[TCG_MAX_INSNS];
+    target_ulong gen_insn_data[TCG_MAX_INSNS][TARGET_INSN_START_WORDS];
 };
 
 extern TCGContext tcg_ctx;
diff --git a/translate-all.c b/translate-all.c
index 9f801ae..f6b8148 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -168,61 +168,127 @@ void cpu_gen_init(void)
     tcg_context_init(&tcg_ctx); 
 }
 
+/* Encode VAL as a signed leb128 sequence at P.
+   Return P incremented past the encoded value.  */
+static uint8_t *encode_sleb128(uint8_t *p, target_long val)
+{
+    int more, byte;
+
+    do {
+        byte = val & 0x7f;
+        val >>= 7;
+        more = !((val == 0 && (byte & 0x40) == 0)
+                 || (val == -1 && (byte & 0x40) != 0));
+        if (more)
+          byte |= 0x80;
+        *p++ = byte;
+    } while (more);
+
+    return p;
+}
+
+/* Decode a signed leb128 sequence at *PP; increment *PP past the
+   decoded value.  Return the decoded value.  */
+static target_long decode_sleb128(uint8_t **pp)
+{
+    uint8_t *p = *pp;
+    target_long val = 0;
+    int byte, shift = 0;
+
+    do {
+        byte = *p++;
+        val |= (target_ulong)(byte & 0x7f) << shift;
+        shift += 7;
+    } while (byte & 0x80);
+    if (shift < TARGET_LONG_BITS && (byte & 0x40)) {
+        val |= -(target_ulong)1 << shift;
+    }
+
+    *pp = p;
+    return val;
+}
+
+/* Encode the data collected about the instructions while compiling TB.
+   Place the data at BLOCK, and return the number of bytes consumed.
+
+   The logical table consists of TARGET_INSN_START_WORDS target_ulong's,
+   which come from the target's insn_start data, followed by a uintptr_t
+   which comes from the host pc of the end of the code implementing the insn.
+
+   Each line of the table is encoded as sleb128 deltas from the previous
+   line.  The seed for the first line is { tb->pc, 0..., tb->tc_ptr }.
+   That is, the first column is seeded with the guest pc, the last column
+   with the host pc, and the middle columns with zeros.  */
+
+static int encode_search(TranslationBlock *tb, uint8_t *block)
+{
+    uint8_t *p = block;
+    int i, j, n;
+
+    tb->tc_search = block;
+
+    for (i = 0, n = tb->icount; i < n; ++i) {
+        target_ulong prev;
+
+        for (j = 0; j < TARGET_INSN_START_WORDS; ++j) {
+            if (i == 0) {
+                prev = (j == 0 ? tb->pc : 0);
+            } else {
+                prev = tcg_ctx.gen_insn_data[i - 1][j];
+            }
+            p = encode_sleb128(p, tcg_ctx.gen_insn_data[i][j] - prev);
+        }
+        prev = (i == 0 ? 0 : tcg_ctx.gen_insn_end_off[i - 1]);
+        p = encode_sleb128(p, tcg_ctx.gen_insn_end_off[i] - prev);
+    }
+
+    return p - block;
+}
+
 /* The cpu state corresponding to 'searched_pc' is restored.  */
 static int cpu_restore_state_from_tb(CPUState *cpu, TranslationBlock *tb,
                                      uintptr_t searched_pc)
 {
+    target_ulong data[TARGET_INSN_START_WORDS] = { tb->pc };
+    uintptr_t host_pc = (uintptr_t)tb->tc_ptr;
     CPUArchState *env = cpu->env_ptr;
-    TCGContext *s = &tcg_ctx;
-    int j;
-    uintptr_t tc_ptr;
+    uint8_t *p = tb->tc_search;
+    int i, j, num_insns = tb->icount;
 #ifdef CONFIG_PROFILER
-    int64_t ti;
+    int64_t ti = profile_getclock();
 #endif
 
-#ifdef CONFIG_PROFILER
-    ti = profile_getclock();
-#endif
-    tcg_func_start(s);
+    if (searched_pc < host_pc) {
+        return -1;
+    }
 
-    gen_intermediate_code_pc(env, tb);
+    /* Reconstruct the stored insn data while looking for the point at
+       which the end of the insn exceeds the searched_pc.  */
+    for (i = 0; i < num_insns; ++i) {
+        for (j = 0; j < TARGET_INSN_START_WORDS; ++j) {
+            data[j] += decode_sleb128(&p);
+        }
+        host_pc += decode_sleb128(&p);
+        if (host_pc > searched_pc) {
+            goto found;
+        }
+    }
+    return -1;
 
+ found:
     if (tb->cflags & CF_USE_ICOUNT) {
         assert(use_icount);
         /* Reset the cycle counter to the start of the block.  */
-        cpu->icount_decr.u16.low += tb->icount;
+        cpu->icount_decr.u16.low += num_insns;
         /* Clear the IO flag.  */
         cpu->can_do_io = 0;
     }
-
-    /* find opc index corresponding to search_pc */
-    tc_ptr = (uintptr_t)tb->tc_ptr;
-    if (searched_pc < tc_ptr)
-        return -1;
-
-    s->tb_next_offset = tb->tb_next_offset;
-#ifdef USE_DIRECT_JUMP
-    s->tb_jmp_offset = tb->tb_jmp_offset;
-    s->tb_next = NULL;
-#else
-    s->tb_jmp_offset = NULL;
-    s->tb_next = tb->tb_next;
-#endif
-    j = tcg_gen_code_search_pc(s, (tcg_insn_unit *)tc_ptr,
-                               searched_pc - tc_ptr);
-    if (j < 0)
-        return -1;
-    /* now find start of instruction before */
-    while (s->gen_opc_instr_start[j] == 0) {
-        j--;
-    }
-    cpu->icount_decr.u16.low -= s->gen_opc_icount[j];
-
-    restore_state_to_opc(env, tb, s->gen_opc_data);
+    cpu->icount_decr.u16.low -= i;
+    restore_state_to_opc(env, tb, data);
 
 #ifdef CONFIG_PROFILER
-    s->restore_time += profile_getclock() - ti;
-    s->restore_count++;
+    tcg_ctx.restore_time += profile_getclock() - ti;
+    tcg_ctx.restore_count++;
 #endif
     return 0;
 }
@@ -969,7 +1035,7 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
     tb_page_addr_t phys_pc, phys_page2;
     target_ulong virt_page2;
     tcg_insn_unit *gen_code_buf;
-    int gen_code_size;
+    int gen_code_size, search_size;
 #ifdef CONFIG_PROFILER
     int64_t ti;
 #endif
@@ -1025,11 +1091,13 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
 #endif
 
     gen_code_size = tcg_gen_code(&tcg_ctx, gen_code_buf);
+    search_size = encode_search(tb, (void *)gen_code_buf + gen_code_size);
 
 #ifdef CONFIG_PROFILER
     tcg_ctx.code_time += profile_getclock();
     tcg_ctx.code_in_len += tb->size;
     tcg_ctx.code_out_len += gen_code_size;
+    tcg_ctx.search_out_len += search_size;
 #endif
 
 #ifdef DEBUG_DISAS
@@ -1041,8 +1109,9 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
     }
 #endif
 
-    tcg_ctx.code_gen_ptr = (void *)(((uintptr_t)gen_code_buf +
-            gen_code_size + CODE_GEN_ALIGN - 1) & ~(CODE_GEN_ALIGN - 1));
+    tcg_ctx.code_gen_ptr = (void *)
+        ROUND_UP((uintptr_t)gen_code_buf + gen_code_size + search_size,
+                 CODE_GEN_ALIGN);
 
     /* check next page if needed */
     virt_page2 = (pc + tb->size - 1) & TARGET_PAGE_MASK;
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [Qemu-devel] [PATCH v3 21/25] tcg: Remove gen_intermediate_code_pc
  2015-09-22 20:24 [Qemu-devel] [PATCH v3 00/25] Do away with TB retranslation Richard Henderson
                   ` (19 preceding siblings ...)
  2015-09-22 20:25 ` [Qemu-devel] [PATCH v3 20/25] tcg: Save insn data and use it in cpu_restore_state_from_tb Richard Henderson
@ 2015-09-22 20:25 ` Richard Henderson
  2015-09-25 21:11   ` Aurelien Jarno
  2015-09-22 20:25 ` [Qemu-devel] [PATCH v3 22/25] tcg: Remove tcg_gen_code_search_pc Richard Henderson
                   ` (3 subsequent siblings)
  24 siblings, 1 reply; 53+ messages in thread
From: Richard Henderson @ 2015-09-22 20:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, alex.bennee, aurelien

gen_intermediate_code_pc is no longer used, so tidy up everything
reached by it.  This includes the gen_opc_* arrays, the search_pc
parameter, and the inline gen_intermediate_code_internal functions.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
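With the search_pc pass gone, per-insn state such as the ARM condexec
bits travels only through the extra insn_start word.  A minimal sketch
of that round trip is below.  DemoArmEnv and the demo_* functions are
hypothetical, and the assumption that the restore side stores the second
word directly into a condexec field is mine; only the packing expression
is taken from the tcg_gen_insn_start call in this series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the parts of the CPU state involved. */
typedef struct {
    uint32_t pc;
    uint32_t condexec_bits;
} DemoArmEnv;

/* Pack the condexec state the way the translator passes it as the
   second insn_start word. */
static uint32_t demo_pack_condexec(uint32_t cond, uint32_t mask)
{
    return (cond << 4) | (mask >> 1);
}

/* Restore state from the decoded insn_start words: word 0 is the guest
   pc, word 1 the packed condexec bits (assumed layout). */
static void demo_restore_state(DemoArmEnv *env, const uint32_t *data)
{
    assert(env != NULL && data != NULL);
    env->pc = data[0];
    env->condexec_bits = data[1];
}
```

Nothing here depends on a second translation pass: the words are
captured once, at the time the insn is translated.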
 include/exec/exec-all.h       |  1 -
 target-alpha/translate.c      | 41 ++++----------------------------
 target-arm/translate-a64.c    | 30 +++---------------------
 target-arm/translate.c        | 54 ++++++++-----------------------------------
 target-arm/translate.h        |  8 ++-----
 target-cris/translate.c       | 50 +++++----------------------------------
 target-i386/translate.c       | 49 ++++-----------------------------------
 target-lm32/translate.c       | 42 ++++-----------------------------
 target-m68k/translate.c       | 43 ++++------------------------------
 target-microblaze/translate.c | 40 ++++----------------------------
 target-mips/translate.c       | 48 ++++----------------------------------
 target-moxie/translate.c      | 41 ++++----------------------------
 target-openrisc/translate.c   | 42 ++++-----------------------------
 target-ppc/translate.c        | 40 ++++----------------------------
 target-s390x/translate.c      | 44 ++++-------------------------------
 target-sh4/translate.c        | 43 ++++------------------------------
 target-sparc/translate.c      | 51 ++++------------------------------------
 target-tilegx/translate.c     | 41 ++++----------------------------
 target-tricore/translate.c    | 31 ++++---------------------
 target-unicore32/translate.c  | 44 ++++-------------------------------
 target-xtensa/translate.c     | 39 ++++---------------------------
 tcg/tcg.h                     |  4 ----
 22 files changed, 90 insertions(+), 736 deletions(-)

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 402dd87..6871e78 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -73,7 +73,6 @@ typedef struct TranslationBlock TranslationBlock;
 #include "qemu/log.h"
 
 void gen_intermediate_code(CPUArchState *env, struct TranslationBlock *tb);
-void gen_intermediate_code_pc(CPUArchState *env, struct TranslationBlock *tb);
 void restore_state_to_opc(CPUArchState *env, struct TranslationBlock *tb,
                           target_ulong *data);
 
diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index 8395a30..f936d1b 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -2858,17 +2858,14 @@ static ExitStatus translate_one(DisasContext *ctx, uint32_t insn)
     return ret;
 }
 
-static inline void gen_intermediate_code_internal(AlphaCPU *cpu,
-                                                  TranslationBlock *tb,
-                                                  bool search_pc)
+void gen_intermediate_code(CPUAlphaState *env, struct TranslationBlock *tb)
 {
+    AlphaCPU *cpu = alpha_env_get_cpu(env);
     CPUState *cs = CPU(cpu);
-    CPUAlphaState *env = &cpu->env;
     DisasContext ctx, *ctxp = &ctx;
     target_ulong pc_start;
     target_ulong pc_mask;
     uint32_t insn;
-    int j, lj = -1;
     ExitStatus ret;
     int num_insns;
     int max_insns;
@@ -2915,18 +2912,6 @@ static inline void gen_intermediate_code_internal(AlphaCPU *cpu,
 
     gen_tb_start(tb);
     do {
-        if (search_pc) {
-            j = tcg_op_buf_count();
-            if (lj < j) {
-                lj++;
-                while (lj < j) {
-                    tcg_ctx.gen_opc_instr_start[lj++] = 0;
-                }
-            }
-            tcg_ctx.gen_opc_pc[lj] = ctx.pc;
-            tcg_ctx.gen_opc_instr_start[lj] = 1;
-            tcg_ctx.gen_opc_icount[lj] = num_insns;
-        }
         tcg_gen_insn_start(ctx.pc);
         num_insns++;
 
@@ -2993,16 +2978,8 @@ static inline void gen_intermediate_code_internal(AlphaCPU *cpu,
 
     gen_tb_end(tb, num_insns);
 
-    if (search_pc) {
-        j = tcg_op_buf_count();
-        lj++;
-        while (lj <= j) {
-            tcg_ctx.gen_opc_instr_start[lj++] = 0;
-        }
-    } else {
-        tb->size = ctx.pc - pc_start;
-        tb->icount = num_insns;
-    }
+    tb->size = ctx.pc - pc_start;
+    tb->icount = num_insns;
 
 #ifdef DEBUG_DISAS
     if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) {
@@ -3013,16 +2990,6 @@ static inline void gen_intermediate_code_internal(AlphaCPU *cpu,
 #endif
 }
 
-void gen_intermediate_code (CPUAlphaState *env, struct TranslationBlock *tb)
-{
-    gen_intermediate_code_internal(alpha_env_get_cpu(env), tb, false);
-}
-
-void gen_intermediate_code_pc (CPUAlphaState *env, struct TranslationBlock *tb)
-{
-    gen_intermediate_code_internal(alpha_env_get_cpu(env), tb, true);
-}
-
 void restore_state_to_opc(CPUAlphaState *env, TranslationBlock *tb,
                           target_ulong *data)
 {
diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index 5022fc3..e65e309 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -11000,14 +11000,11 @@ static void disas_a64_insn(CPUARMState *env, DisasContext *s)
     free_tmp_a64(s);
 }
 
-void gen_intermediate_code_internal_a64(ARMCPU *cpu,
-                                        TranslationBlock *tb,
-                                        bool search_pc)
+void gen_intermediate_code_a64(ARMCPU *cpu, TranslationBlock *tb)
 {
     CPUState *cs = CPU(cpu);
     CPUARMState *env = &cpu->env;
     DisasContext dc1, *dc = &dc1;
-    int j, lj;
     target_ulong pc_start;
     target_ulong next_page_start;
     int num_insns;
@@ -11066,7 +11063,6 @@ void gen_intermediate_code_internal_a64(ARMCPU *cpu,
     init_tmp_a64_array(dc);
 
     next_page_start = (pc_start & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
-    lj = -1;
     num_insns = 0;
     max_insns = tb->cflags & CF_COUNT_MASK;
     if (max_insns == 0) {
@@ -11081,18 +11077,6 @@ void gen_intermediate_code_internal_a64(ARMCPU *cpu,
     tcg_clear_temp_count();
 
     do {
-        if (search_pc) {
-            j = tcg_op_buf_count();
-            if (lj < j) {
-                lj++;
-                while (lj < j) {
-                    tcg_ctx.gen_opc_instr_start[lj++] = 0;
-                }
-            }
-            tcg_ctx.gen_opc_pc[lj] = dc->pc;
-            tcg_ctx.gen_opc_instr_start[lj] = 1;
-            tcg_ctx.gen_opc_icount[lj] = num_insns;
-        }
         tcg_gen_insn_start(dc->pc, 0);
         num_insns++;
 
@@ -11221,14 +11205,6 @@ done_generating:
         qemu_log("\n");
     }
 #endif
-    if (search_pc) {
-        j = tcg_op_buf_count();
-        lj++;
-        while (lj <= j) {
-            tcg_ctx.gen_opc_instr_start[lj++] = 0;
-        }
-    } else {
-        tb->size = dc->pc - pc_start;
-        tb->icount = num_insns;
-    }
+    tb->size = dc->pc - pc_start;
+    tb->icount = num_insns;
 }
diff --git a/target-arm/translate.c b/target-arm/translate.c
index 2296953..22c3587 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -52,7 +52,6 @@
 #define ARCH(x) do { if (!ENABLE_ARCH_##x) goto illegal_op; } while(0)
 
 #include "translate.h"
-static uint32_t gen_opc_condexec_bits[OPC_BUF_SIZE];
 
 #if defined(CONFIG_USER_ONLY)
 #define IS_USER(s) 1
@@ -11168,16 +11167,12 @@ undef:
 }
 
 /* generate intermediate code in gen_opc_buf and gen_opparam_buf for
-   basic block 'tb'. If search_pc is TRUE, also generate PC
-   information for each intermediate instruction. */
-static inline void gen_intermediate_code_internal(ARMCPU *cpu,
-                                                  TranslationBlock *tb,
-                                                  bool search_pc)
+   basic block 'tb'.  */
+void gen_intermediate_code(CPUARMState *env, TranslationBlock *tb)
 {
+    ARMCPU *cpu = arm_env_get_cpu(env);
     CPUState *cs = CPU(cpu);
-    CPUARMState *env = &cpu->env;
     DisasContext dc1, *dc = &dc1;
-    int j, lj;
     target_ulong pc_start;
     target_ulong next_page_start;
     int num_insns;
@@ -11189,7 +11184,7 @@ static inline void gen_intermediate_code_internal(ARMCPU *cpu,
      * the A32/T32 complexity to do with conditional execution/IT blocks/etc.
      */
     if (ARM_TBFLAG_AARCH64_STATE(tb->flags)) {
-        gen_intermediate_code_internal_a64(cpu, tb, search_pc);
+        gen_intermediate_code_a64(cpu, tb);
         return;
     }
 
@@ -11255,7 +11250,6 @@ static inline void gen_intermediate_code_internal(ARMCPU *cpu,
     /* FIXME: cpu_M0 can probably be the same as cpu_V0.  */
     cpu_M0 = tcg_temp_new_i64();
     next_page_start = (pc_start & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
-    lj = -1;
     num_insns = 0;
     max_insns = tb->cflags & CF_COUNT_MASK;
     if (max_insns == 0) {
@@ -11289,10 +11283,9 @@ static inline void gen_intermediate_code_internal(ARMCPU *cpu,
      * (3) if we leave the TB unexpectedly (eg a data abort on a load)
      * then the CPUARMState will be wrong and we need to reset it.
      * This is handled in the same way as restoration of the
-     * PC in these situations: we will be called again with search_pc=1
-     * and generate a mapping of the condexec bits for each PC in
-     * gen_opc_condexec_bits[]. restore_state_to_opc() then uses
-     * this to restore the condexec bits.
+     * PC in these situations; we save the value of the condexec bits
+     * for each PC via tcg_gen_insn_start(), and restore_state_to_opc()
+     * then uses this to restore them after an exception.
      *
      * Note that there are no instructions which can read the condexec
      * bits, and none which can write non-static values to them, so
@@ -11309,18 +11302,6 @@ static inline void gen_intermediate_code_internal(ARMCPU *cpu,
         store_cpu_field(tmp, condexec_bits);
       }
     do {
-        if (search_pc) {
-            j = tcg_op_buf_count();
-            if (lj < j) {
-                lj++;
-                while (lj < j)
-                    tcg_ctx.gen_opc_instr_start[lj++] = 0;
-            }
-            tcg_ctx.gen_opc_pc[lj] = dc->pc;
-            gen_opc_condexec_bits[lj] = (dc->condexec_cond << 4) | (dc->condexec_mask >> 1);
-            tcg_ctx.gen_opc_instr_start[lj] = 1;
-            tcg_ctx.gen_opc_icount[lj] = num_insns;
-        }
         tcg_gen_insn_start(dc->pc,
                            (dc->condexec_cond << 4) | (dc->condexec_mask >> 1));
         num_insns++;
@@ -11537,25 +11518,8 @@ done_generating:
         qemu_log("\n");
     }
 #endif
-    if (search_pc) {
-        j = tcg_op_buf_count();
-        lj++;
-        while (lj <= j)
-            tcg_ctx.gen_opc_instr_start[lj++] = 0;
-    } else {
-        tb->size = dc->pc - pc_start;
-        tb->icount = num_insns;
-    }
-}
-
-void gen_intermediate_code(CPUARMState *env, TranslationBlock *tb)
-{
-    gen_intermediate_code_internal(arm_env_get_cpu(env), tb, false);
-}
-
-void gen_intermediate_code_pc(CPUARMState *env, TranslationBlock *tb)
-{
-    gen_intermediate_code_internal(arm_env_get_cpu(env), tb, true);
+    tb->size = dc->pc - pc_start;
+    tb->icount = num_insns;
 }
 
 static const char *cpu_mode_names[16] = {
diff --git a/target-arm/translate.h b/target-arm/translate.h
index b8fe37a..53ef971 100644
--- a/target-arm/translate.h
+++ b/target-arm/translate.h
@@ -122,9 +122,7 @@ static inline int default_exception_el(DisasContext *s)
 
 #ifdef TARGET_AARCH64
 void a64_translate_init(void);
-void gen_intermediate_code_internal_a64(ARMCPU *cpu,
-                                        TranslationBlock *tb,
-                                        bool search_pc);
+void gen_intermediate_code_a64(ARMCPU *cpu, TranslationBlock *tb);
 void gen_a64_set_pc_im(uint64_t val);
 void aarch64_cpu_dump_state(CPUState *cs, FILE *f,
                             fprintf_function cpu_fprintf, int flags);
@@ -133,9 +131,7 @@ static inline void a64_translate_init(void)
 {
 }
 
-static inline void gen_intermediate_code_internal_a64(ARMCPU *cpu,
-                                                      TranslationBlock *tb,
-                                                      bool search_pc)
+static inline void gen_intermediate_code_a64(ARMCPU *cpu, TranslationBlock *tb)
 {
 }
 
diff --git a/target-cris/translate.c b/target-cris/translate.c
index 77e2794..964845c 100644
--- a/target-cris/translate.c
+++ b/target-cris/translate.c
@@ -3067,15 +3067,12 @@ static unsigned int crisv32_decoder(CPUCRISState *env, DisasContext *dc)
  */
 
 /* generate intermediate code for basic block 'tb'.  */
-static inline void
-gen_intermediate_code_internal(CRISCPU *cpu, TranslationBlock *tb,
-                               bool search_pc)
+void gen_intermediate_code(CPUCRISState *env, struct TranslationBlock *tb)
 {
+    CRISCPU *cpu = cris_env_get_cpu(env);
     CPUState *cs = CPU(cpu);
-    CPUCRISState *env = &cpu->env;
     uint32_t pc_start;
     unsigned int insn_len;
-    int j, lj;
     struct DisasContext ctx;
     struct DisasContext *dc = &ctx;
     uint32_t next_page_start;
@@ -3127,13 +3124,13 @@ gen_intermediate_code_internal(CRISCPU *cpu, TranslationBlock *tb,
 
     if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) {
         qemu_log(
-                "srch=%d pc=%x %x flg=%" PRIx64 " bt=%x ds=%u ccs=%x\n"
+                "pc=%x %x flg=%" PRIx64 " bt=%x ds=%u ccs=%x\n"
                 "pid=%x usp=%x\n"
                 "%x.%x.%x.%x\n"
                 "%x.%x.%x.%x\n"
                 "%x.%x.%x.%x\n"
                 "%x.%x.%x.%x\n",
-                search_pc, dc->pc, dc->ppc,
+                dc->pc, dc->ppc,
                 (uint64_t)tb->flags,
                 env->btarget, (unsigned)tb->flags & 7,
                 env->pregs[PR_CCS],
@@ -3149,7 +3146,6 @@ gen_intermediate_code_internal(CRISCPU *cpu, TranslationBlock *tb,
     }
 
     next_page_start = (pc_start & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
-    lj = -1;
     num_insns = 0;
     max_insns = tb->cflags & CF_COUNT_MASK;
     if (max_insns == 0) {
@@ -3161,22 +3157,6 @@ gen_intermediate_code_internal(CRISCPU *cpu, TranslationBlock *tb,
 
     gen_tb_start(tb);
     do {
-        if (search_pc) {
-            j = tcg_op_buf_count();
-            if (lj < j) {
-                lj++;
-                while (lj < j) {
-                    tcg_ctx.gen_opc_instr_start[lj++] = 0;
-                }
-            }
-            if (dc->delayed_branch == 1) {
-                tcg_ctx.gen_opc_pc[lj] = dc->ppc | 1;
-            } else {
-                tcg_ctx.gen_opc_pc[lj] = dc->pc;
-            }
-            tcg_ctx.gen_opc_instr_start[lj] = 1;
-            tcg_ctx.gen_opc_icount[lj] = num_insns;
-        }
         tcg_gen_insn_start(dc->delayed_branch == 1
                            ? dc->ppc | 1 : dc->pc);
         num_insns++;
@@ -3308,16 +3288,8 @@ gen_intermediate_code_internal(CRISCPU *cpu, TranslationBlock *tb,
     }
     gen_tb_end(tb, num_insns);
 
-    if (search_pc) {
-        j = tcg_op_buf_count();
-        lj++;
-        while (lj <= j) {
-            tcg_ctx.gen_opc_instr_start[lj++] = 0;
-        }
-    } else {
-        tb->size = dc->pc - pc_start;
-        tb->icount = num_insns;
-    }
+    tb->size = dc->pc - pc_start;
+    tb->icount = num_insns;
 
 #ifdef DEBUG_DISAS
 #if !DISAS_CRIS
@@ -3331,16 +3303,6 @@ gen_intermediate_code_internal(CRISCPU *cpu, TranslationBlock *tb,
 #endif
 }
 
-void gen_intermediate_code (CPUCRISState *env, struct TranslationBlock *tb)
-{
-    gen_intermediate_code_internal(cris_env_get_cpu(env), tb, false);
-}
-
-void gen_intermediate_code_pc (CPUCRISState *env, struct TranslationBlock *tb)
-{
-    gen_intermediate_code_internal(cris_env_get_cpu(env), tb, true);
-}
-
 void cris_cpu_dump_state(CPUState *cs, FILE *f, fprintf_function cpu_fprintf,
                          int flags)
 {
diff --git a/target-i386/translate.c b/target-i386/translate.c
index 2f7b77f..ef10e68 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -75,8 +75,6 @@ static TCGv_ptr cpu_ptr0, cpu_ptr1;
 static TCGv_i32 cpu_tmp2_i32, cpu_tmp3_i32;
 static TCGv_i64 cpu_tmp1_i64;
 
-static uint8_t gen_opc_cc_op[OPC_BUF_SIZE];
-
 #include "exec/gen-icount.h"
 
 #ifdef TARGET_X86_64
@@ -7839,17 +7837,13 @@ void optimize_flags_init(void)
 }
 
 /* generate intermediate code in gen_opc_buf and gen_opparam_buf for
-   basic block 'tb'. If search_pc is TRUE, also generate PC
-   information for each intermediate instruction. */
-static inline void gen_intermediate_code_internal(X86CPU *cpu,
-                                                  TranslationBlock *tb,
-                                                  bool search_pc)
+   basic block 'tb'.  */
+void gen_intermediate_code(CPUX86State *env, TranslationBlock *tb)
 {
+    X86CPU *cpu = x86_env_get_cpu(env);
     CPUState *cs = CPU(cpu);
-    CPUX86State *env = &cpu->env;
     DisasContext dc1, *dc = &dc1;
     target_ulong pc_ptr;
-    int j, lj;
     uint64_t flags;
     target_ulong pc_start;
     target_ulong cs_base;
@@ -7929,7 +7923,6 @@ static inline void gen_intermediate_code_internal(X86CPU *cpu,
 
     dc->is_jmp = DISAS_NEXT;
     pc_ptr = pc_start;
-    lj = -1;
     num_insns = 0;
     max_insns = tb->cflags & CF_COUNT_MASK;
     if (max_insns == 0) {
@@ -7941,18 +7934,6 @@ static inline void gen_intermediate_code_internal(X86CPU *cpu,
 
     gen_tb_start(tb);
     for(;;) {
-        if (search_pc) {
-            j = tcg_op_buf_count();
-            if (lj < j) {
-                lj++;
-                while (lj < j)
-                    tcg_ctx.gen_opc_instr_start[lj++] = 0;
-            }
-            tcg_ctx.gen_opc_pc[lj] = pc_ptr;
-            gen_opc_cc_op[lj] = dc->cc_op;
-            tcg_ctx.gen_opc_instr_start[lj] = 1;
-            tcg_ctx.gen_opc_icount[lj] = num_insns;
-        }
         tcg_gen_insn_start(pc_ptr, dc->cc_op);
         num_insns++;
 
@@ -8015,14 +7996,6 @@ static inline void gen_intermediate_code_internal(X86CPU *cpu,
 done_generating:
     gen_tb_end(tb, num_insns);
 
-    /* we don't forget to fill the last values */
-    if (search_pc) {
-        j = tcg_op_buf_count();
-        lj++;
-        while (lj <= j)
-            tcg_ctx.gen_opc_instr_start[lj++] = 0;
-    }
-
 #ifdef DEBUG_DISAS
     if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) {
         int disas_flags;
@@ -8039,20 +8012,8 @@ done_generating:
     }
 #endif
 
-    if (!search_pc) {
-        tb->size = pc_ptr - pc_start;
-        tb->icount = num_insns;
-    }
-}
-
-void gen_intermediate_code(CPUX86State *env, TranslationBlock *tb)
-{
-    gen_intermediate_code_internal(x86_env_get_cpu(env), tb, false);
-}
-
-void gen_intermediate_code_pc(CPUX86State *env, TranslationBlock *tb)
-{
-    gen_intermediate_code_internal(x86_env_get_cpu(env), tb, true);
+    tb->size = pc_ptr - pc_start;
+    tb->icount = num_insns;
 }
 
 void restore_state_to_opc(CPUX86State *env, TranslationBlock *tb,
diff --git a/target-lm32/translate.c b/target-lm32/translate.c
index 3379d2c..c61ad0f 100644
--- a/target-lm32/translate.c
+++ b/target-lm32/translate.c
@@ -1033,15 +1033,12 @@ static inline void decode(DisasContext *dc, uint32_t ir)
 }
 
 /* generate intermediate code for basic block 'tb'.  */
-static inline
-void gen_intermediate_code_internal(LM32CPU *cpu,
-                                    TranslationBlock *tb, bool search_pc)
+void gen_intermediate_code(CPULM32State *env, struct TranslationBlock *tb)
 {
+    LM32CPU *cpu = lm32_env_get_cpu(env);
     CPUState *cs = CPU(cpu);
-    CPULM32State *env = &cpu->env;
     struct DisasContext ctx, *dc = &ctx;
     uint32_t pc_start;
-    int j, lj;
     uint32_t next_page_start;
     int num_insns;
     int max_insns;
@@ -1063,7 +1060,6 @@ void gen_intermediate_code_internal(LM32CPU *cpu,
     }
 
     next_page_start = (pc_start & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
-    lj = -1;
     num_insns = 0;
     max_insns = tb->cflags & CF_COUNT_MASK;
     if (max_insns == 0) {
@@ -1075,18 +1071,6 @@ void gen_intermediate_code_internal(LM32CPU *cpu,
 
     gen_tb_start(tb);
     do {
-        if (search_pc) {
-            j = tcg_op_buf_count();
-            if (lj < j) {
-                lj++;
-                while (lj < j) {
-                    tcg_ctx.gen_opc_instr_start[lj++] = 0;
-                }
-            }
-            tcg_ctx.gen_opc_pc[lj] = dc->pc;
-            tcg_ctx.gen_opc_instr_start[lj] = 1;
-            tcg_ctx.gen_opc_icount[lj] = num_insns;
-        }
         tcg_gen_insn_start(dc->pc);
         num_insns++;
 
@@ -1142,16 +1126,8 @@ void gen_intermediate_code_internal(LM32CPU *cpu,
 
     gen_tb_end(tb, num_insns);
 
-    if (search_pc) {
-        j = tcg_op_buf_count();
-        lj++;
-        while (lj <= j) {
-            tcg_ctx.gen_opc_instr_start[lj++] = 0;
-        }
-    } else {
-        tb->size = dc->pc - pc_start;
-        tb->icount = num_insns;
-    }
+    tb->size = dc->pc - pc_start;
+    tb->icount = num_insns;
 
 #ifdef DEBUG_DISAS
     if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) {
@@ -1163,16 +1139,6 @@ void gen_intermediate_code_internal(LM32CPU *cpu,
 #endif
 }
 
-void gen_intermediate_code(CPULM32State *env, struct TranslationBlock *tb)
-{
-    gen_intermediate_code_internal(lm32_env_get_cpu(env), tb, false);
-}
-
-void gen_intermediate_code_pc(CPULM32State *env, struct TranslationBlock *tb)
-{
-    gen_intermediate_code_internal(lm32_env_get_cpu(env), tb, true);
-}
-
 void lm32_cpu_dump_state(CPUState *cs, FILE *f, fprintf_function cpu_fprintf,
                          int flags)
 {
diff --git a/target-m68k/translate.c b/target-m68k/translate.c
index ce8150e..5995cce 100644
--- a/target-m68k/translate.c
+++ b/target-m68k/translate.c
@@ -2962,14 +2962,11 @@ static void disas_m68k_insn(CPUM68KState * env, DisasContext *s)
 }
 
 /* generate intermediate code for basic block 'tb'.  */
-static inline void
-gen_intermediate_code_internal(M68kCPU *cpu, TranslationBlock *tb,
-                               bool search_pc)
+void gen_intermediate_code(CPUM68KState *env, TranslationBlock *tb)
 {
+    M68kCPU *cpu = m68k_env_get_cpu(env);
     CPUState *cs = CPU(cpu);
-    CPUM68KState *env = &cpu->env;
     DisasContext dc1, *dc = &dc1;
-    int j, lj;
     target_ulong pc_start;
     int pc_offset;
     int num_insns;
@@ -2988,7 +2985,6 @@ gen_intermediate_code_internal(M68kCPU *cpu, TranslationBlock *tb,
     dc->fpcr = env->fpcr;
     dc->user = (env->sr & SR_S) == 0;
     dc->done_mac = 0;
-    lj = -1;
     num_insns = 0;
     max_insns = tb->cflags & CF_COUNT_MASK;
     if (max_insns == 0) {
@@ -3002,17 +2998,6 @@ gen_intermediate_code_internal(M68kCPU *cpu, TranslationBlock *tb,
     do {
         pc_offset = dc->pc - pc_start;
         gen_throws_exception = NULL;
-        if (search_pc) {
-            j = tcg_op_buf_count();
-            if (lj < j) {
-                lj++;
-                while (lj < j)
-                    tcg_ctx.gen_opc_instr_start[lj++] = 0;
-            }
-            tcg_ctx.gen_opc_pc[lj] = dc->pc;
-            tcg_ctx.gen_opc_instr_start[lj] = 1;
-            tcg_ctx.gen_opc_icount[lj] = num_insns;
-        }
         tcg_gen_insn_start(dc->pc);
         num_insns++;
 
@@ -3071,28 +3056,8 @@ gen_intermediate_code_internal(M68kCPU *cpu, TranslationBlock *tb,
         qemu_log("\n");
     }
 #endif
-    if (search_pc) {
-        j = tcg_op_buf_count();
-        lj++;
-        while (lj <= j)
-            tcg_ctx.gen_opc_instr_start[lj++] = 0;
-    } else {
-        tb->size = dc->pc - pc_start;
-        tb->icount = num_insns;
-    }
-
-    //optimize_flags();
-    //expand_target_qops();
-}
-
-void gen_intermediate_code(CPUM68KState *env, TranslationBlock *tb)
-{
-    gen_intermediate_code_internal(m68k_env_get_cpu(env), tb, false);
-}
-
-void gen_intermediate_code_pc(CPUM68KState *env, TranslationBlock *tb)
-{
-    gen_intermediate_code_internal(m68k_env_get_cpu(env), tb, true);
+    tb->size = dc->pc - pc_start;
+    tb->icount = num_insns;
 }
 
 void m68k_cpu_dump_state(CPUState *cs, FILE *f, fprintf_function cpu_fprintf,
diff --git a/target-microblaze/translate.c b/target-microblaze/translate.c
index 973c744..a9c5010 100644
--- a/target-microblaze/translate.c
+++ b/target-microblaze/translate.c
@@ -1627,14 +1627,11 @@ static inline void decode(DisasContext *dc, uint32_t ir)
 }
 
 /* generate intermediate code for basic block 'tb'.  */
-static inline void
-gen_intermediate_code_internal(MicroBlazeCPU *cpu, TranslationBlock *tb,
-                               bool search_pc)
+void gen_intermediate_code(CPUMBState *env, struct TranslationBlock *tb)
 {
+    MicroBlazeCPU *cpu = mb_env_get_cpu(env);
     CPUState *cs = CPU(cpu);
-    CPUMBState *env = &cpu->env;
     uint32_t pc_start;
-    int j, lj;
     struct DisasContext ctx;
     struct DisasContext *dc = &ctx;
     uint32_t next_page_start, org_flags;
@@ -1671,7 +1668,6 @@ gen_intermediate_code_internal(MicroBlazeCPU *cpu, TranslationBlock *tb,
     }
 
     next_page_start = (pc_start & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
-    lj = -1;
     num_insns = 0;
     max_insns = tb->cflags & CF_COUNT_MASK;
     if (max_insns == 0) {
@@ -1684,17 +1680,6 @@ gen_intermediate_code_internal(MicroBlazeCPU *cpu, TranslationBlock *tb,
     gen_tb_start(tb);
     do
     {
-        if (search_pc) {
-            j = tcg_op_buf_count();
-            if (lj < j) {
-                lj++;
-                while (lj < j)
-                    tcg_ctx.gen_opc_instr_start[lj++] = 0;
-            }
-            tcg_ctx.gen_opc_pc[lj] = dc->pc;
-            tcg_ctx.gen_opc_instr_start[lj] = 1;
-                        tcg_ctx.gen_opc_icount[lj] = num_insns;
-        }
         tcg_gen_insn_start(dc->pc);
         num_insns++;
 
@@ -1813,15 +1798,8 @@ gen_intermediate_code_internal(MicroBlazeCPU *cpu, TranslationBlock *tb,
     }
     gen_tb_end(tb, num_insns);
 
-    if (search_pc) {
-        j = tcg_op_buf_count();
-        lj++;
-        while (lj <= j)
-            tcg_ctx.gen_opc_instr_start[lj++] = 0;
-    } else {
-        tb->size = dc->pc - pc_start;
-                tb->icount = num_insns;
-    }
+    tb->size = dc->pc - pc_start;
+    tb->icount = num_insns;
 
 #ifdef DEBUG_DISAS
 #if !SIM_COMPAT
@@ -1838,16 +1816,6 @@ gen_intermediate_code_internal(MicroBlazeCPU *cpu, TranslationBlock *tb,
     assert(!dc->abort_at_next_insn);
 }
 
-void gen_intermediate_code (CPUMBState *env, struct TranslationBlock *tb)
-{
-    gen_intermediate_code_internal(mb_env_get_cpu(env), tb, false);
-}
-
-void gen_intermediate_code_pc (CPUMBState *env, struct TranslationBlock *tb)
-{
-    gen_intermediate_code_internal(mb_env_get_cpu(env), tb, true);
-}
-
 void mb_cpu_dump_state(CPUState *cs, FILE *f, fprintf_function cpu_fprintf,
                        int flags)
 {
diff --git a/target-mips/translate.c b/target-mips/translate.c
index 56f00e6..897839c 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -1359,9 +1359,6 @@ static TCGv_i32 fpu_fcr0, fpu_fcr31;
 static TCGv_i64 fpu_f64[32];
 static TCGv_i64 msa_wr_d[64];
 
-static uint32_t gen_opc_hflags[OPC_BUF_SIZE];
-static target_ulong gen_opc_btarget[OPC_BUF_SIZE];
-
 #include "exec/gen-icount.h"
 
 #define gen_helper_0e0i(name, arg) do {                           \
@@ -19535,24 +19532,18 @@ static void decode_opc(CPUMIPSState *env, DisasContext *ctx)
     }
 }
 
-static inline void
-gen_intermediate_code_internal(MIPSCPU *cpu, TranslationBlock *tb,
-                               bool search_pc)
+void gen_intermediate_code(CPUMIPSState *env, struct TranslationBlock *tb)
 {
+    MIPSCPU *cpu = mips_env_get_cpu(env);
     CPUState *cs = CPU(cpu);
-    CPUMIPSState *env = &cpu->env;
     DisasContext ctx;
     target_ulong pc_start;
     target_ulong next_page_start;
-    int j, lj = -1;
     int num_insns;
     int max_insns;
     int insn_bytes;
     int is_slot;
 
-    if (search_pc)
-        qemu_log("search pc %d\n", search_pc);
-
     pc_start = tb->pc;
     next_page_start = (pc_start & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
     ctx.pc = pc_start;
@@ -19596,19 +19587,6 @@ gen_intermediate_code_internal(MIPSCPU *cpu, TranslationBlock *tb,
     LOG_DISAS("\ntb %p idx %d hflags %04x\n", tb, ctx.mem_idx, ctx.hflags);
     gen_tb_start(tb);
     while (ctx.bstate == BS_NONE) {
-        if (search_pc) {
-            j = tcg_op_buf_count();
-            if (lj < j) {
-                lj++;
-                while (lj < j)
-                    tcg_ctx.gen_opc_instr_start[lj++] = 0;
-            }
-            tcg_ctx.gen_opc_pc[lj] = ctx.pc;
-            gen_opc_hflags[lj] = ctx.hflags & MIPS_HFLAG_BMASK;
-            gen_opc_btarget[lj] = ctx.btarget;
-            tcg_ctx.gen_opc_instr_start[lj] = 1;
-            tcg_ctx.gen_opc_icount[lj] = num_insns;
-        }
         tcg_gen_insn_start(ctx.pc, ctx.hflags & MIPS_HFLAG_BMASK, ctx.btarget);
         num_insns++;
 
@@ -19709,15 +19687,9 @@ gen_intermediate_code_internal(MIPSCPU *cpu, TranslationBlock *tb,
 done_generating:
     gen_tb_end(tb, num_insns);
 
-    if (search_pc) {
-        j = tcg_op_buf_count();
-        lj++;
-        while (lj <= j)
-            tcg_ctx.gen_opc_instr_start[lj++] = 0;
-    } else {
-        tb->size = ctx.pc - pc_start;
-        tb->icount = num_insns;
-    }
+    tb->size = ctx.pc - pc_start;
+    tb->icount = num_insns;
+
 #ifdef DEBUG_DISAS
     LOG_DISAS("\n");
     if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) {
@@ -19728,16 +19700,6 @@ done_generating:
 #endif
 }
 
-void gen_intermediate_code (CPUMIPSState *env, struct TranslationBlock *tb)
-{
-    gen_intermediate_code_internal(mips_env_get_cpu(env), tb, false);
-}
-
-void gen_intermediate_code_pc (CPUMIPSState *env, struct TranslationBlock *tb)
-{
-    gen_intermediate_code_internal(mips_env_get_cpu(env), tb, true);
-}
-
 static void fpu_dump_state(CPUMIPSState *env, FILE *f, fprintf_function fpu_fprintf,
                            int flags)
 {
diff --git a/target-moxie/translate.c b/target-moxie/translate.c
index c007764..f84841e 100644
--- a/target-moxie/translate.c
+++ b/target-moxie/translate.c
@@ -815,15 +815,12 @@ static int decode_opc(MoxieCPU *cpu, DisasContext *ctx)
 }
 
 /* generate intermediate code for basic block 'tb'.  */
-static inline void
-gen_intermediate_code_internal(MoxieCPU *cpu, TranslationBlock *tb,
-                               bool search_pc)
+void gen_intermediate_code(CPUMoxieState *env, struct TranslationBlock *tb)
 {
+    MoxieCPU *cpu = moxie_env_get_cpu(env);
     CPUState *cs = CPU(cpu);
     DisasContext ctx;
     target_ulong pc_start;
-    int j, lj = -1;
-    CPUMoxieState *env = &cpu->env;
     int num_insns, max_insns;
 
     pc_start = tb->pc;
@@ -844,18 +841,6 @@ gen_intermediate_code_internal(MoxieCPU *cpu, TranslationBlock *tb,
 
     gen_tb_start(tb);
     do {
-        if (search_pc) {
-            j = tcg_op_buf_count();
-            if (lj < j) {
-                lj++;
-                while (lj < j) {
-                    tcg_ctx.gen_opc_instr_start[lj++] = 0;
-                }
-            }
-            tcg_ctx.gen_opc_pc[lj] = ctx.pc;
-            tcg_ctx.gen_opc_instr_start[lj] = 1;
-            tcg_ctx.gen_opc_icount[lj] = num_insns;
-        }
         tcg_gen_insn_start(ctx.pc);
         num_insns++;
 
@@ -900,26 +885,8 @@ gen_intermediate_code_internal(MoxieCPU *cpu, TranslationBlock *tb,
  done_generating:
     gen_tb_end(tb, num_insns);
 
-    if (search_pc) {
-        j = tcg_op_buf_count();
-        lj++;
-        while (lj <= j) {
-            tcg_ctx.gen_opc_instr_start[lj++] = 0;
-        }
-    } else {
-        tb->size = ctx.pc - pc_start;
-        tb->icount = num_insns;
-    }
-}
-
-void gen_intermediate_code(CPUMoxieState *env, struct TranslationBlock *tb)
-{
-    gen_intermediate_code_internal(moxie_env_get_cpu(env), tb, false);
-}
-
-void gen_intermediate_code_pc(CPUMoxieState *env, struct TranslationBlock *tb)
-{
-    gen_intermediate_code_internal(moxie_env_get_cpu(env), tb, true);
+    tb->size = ctx.pc - pc_start;
+    tb->icount = num_insns;
 }
 
 void restore_state_to_opc(CPUMoxieState *env, TranslationBlock *tb,
diff --git a/target-openrisc/translate.c b/target-openrisc/translate.c
index 26bf87f..b66fde1 100644
--- a/target-openrisc/translate.c
+++ b/target-openrisc/translate.c
@@ -1618,14 +1618,12 @@ static void disas_openrisc_insn(DisasContext *dc, OpenRISCCPU *cpu)
     }
 }
 
-static inline void gen_intermediate_code_internal(OpenRISCCPU *cpu,
-                                                  TranslationBlock *tb,
-                                                  int search_pc)
+void gen_intermediate_code(CPUOpenRISCState *env, struct TranslationBlock *tb)
 {
+    OpenRISCCPU *cpu = openrisc_env_get_cpu(env);
     CPUState *cs = CPU(cpu);
     struct DisasContext ctx, *dc = &ctx;
     uint32_t pc_start;
-    int j, k;
     uint32_t next_page_start;
     int num_insns;
     int max_insns;
@@ -1647,7 +1645,6 @@ static inline void gen_intermediate_code_internal(OpenRISCCPU *cpu,
     }
 
     next_page_start = (pc_start & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
-    k = -1;
     num_insns = 0;
     max_insns = tb->cflags & CF_COUNT_MASK;
 
@@ -1661,18 +1658,6 @@ static inline void gen_intermediate_code_internal(OpenRISCCPU *cpu,
     gen_tb_start(tb);
 
     do {
-        if (search_pc) {
-            j = tcg_op_buf_count();
-            if (k < j) {
-                k++;
-                while (k < j) {
-                    tcg_ctx.gen_opc_instr_start[k++] = 0;
-                }
-            }
-            tcg_ctx.gen_opc_pc[k] = dc->pc;
-            tcg_ctx.gen_opc_instr_start[k] = 1;
-            tcg_ctx.gen_opc_icount[k] = num_insns;
-        }
         tcg_gen_insn_start(dc->pc);
         num_insns++;
 
@@ -1746,16 +1731,8 @@ static inline void gen_intermediate_code_internal(OpenRISCCPU *cpu,
 
     gen_tb_end(tb, num_insns);
 
-    if (search_pc) {
-        j = tcg_op_buf_count();
-        k++;
-        while (k <= j) {
-            tcg_ctx.gen_opc_instr_start[k++] = 0;
-        }
-    } else {
-        tb->size = dc->pc - pc_start;
-        tb->icount = num_insns;
-    }
+    tb->size = dc->pc - pc_start;
+    tb->icount = num_insns;
 
 #ifdef DEBUG_DISAS
     if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) {
@@ -1767,17 +1744,6 @@ static inline void gen_intermediate_code_internal(OpenRISCCPU *cpu,
 #endif
 }
 
-void gen_intermediate_code(CPUOpenRISCState *env, struct TranslationBlock *tb)
-{
-    gen_intermediate_code_internal(openrisc_env_get_cpu(env), tb, 0);
-}
-
-void gen_intermediate_code_pc(CPUOpenRISCState *env,
-                              struct TranslationBlock *tb)
-{
-    gen_intermediate_code_internal(openrisc_env_get_cpu(env), tb, 1);
-}
-
 void openrisc_cpu_dump_state(CPUState *cs, FILE *f,
                              fprintf_function cpu_fprintf,
                              int flags)
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 5c2dc38..c2bc1a7 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -11409,16 +11409,13 @@ void ppc_cpu_dump_statistics(CPUState *cs, FILE*f,
 }
 
 /*****************************************************************************/
-static inline void gen_intermediate_code_internal(PowerPCCPU *cpu,
-                                                  TranslationBlock *tb,
-                                                  bool search_pc)
+void gen_intermediate_code(CPUPPCState *env, struct TranslationBlock *tb)
 {
+    PowerPCCPU *cpu = ppc_env_get_cpu(env);
     CPUState *cs = CPU(cpu);
-    CPUPPCState *env = &cpu->env;
     DisasContext ctx, *ctxp = &ctx;
     opc_handler_t **table, *handler;
     target_ulong pc_start;
-    int j, lj = -1;
     int num_insns;
     int max_insns;
 
@@ -11486,17 +11483,6 @@ static inline void gen_intermediate_code_internal(PowerPCCPU *cpu,
     tcg_clear_temp_count();
     /* Set env in case of segfault during code fetch */
     while (ctx.exception == POWERPC_EXCP_NONE && !tcg_op_buf_full()) {
-        if (unlikely(search_pc)) {
-            j = tcg_op_buf_count();
-            if (lj < j) {
-                lj++;
-                while (lj < j)
-                    tcg_ctx.gen_opc_instr_start[lj++] = 0;
-            }
-            tcg_ctx.gen_opc_pc[lj] = ctx.nip;
-            tcg_ctx.gen_opc_instr_start[lj] = 1;
-            tcg_ctx.gen_opc_icount[lj] = num_insns;
-        }
         tcg_gen_insn_start(ctx.nip);
         num_insns++;
 
@@ -11598,15 +11584,9 @@ static inline void gen_intermediate_code_internal(PowerPCCPU *cpu,
     }
     gen_tb_end(tb, num_insns);
 
-    if (unlikely(search_pc)) {
-        j = tcg_op_buf_count();
-        lj++;
-        while (lj <= j)
-            tcg_ctx.gen_opc_instr_start[lj++] = 0;
-    } else {
-        tb->size = ctx.nip - pc_start;
-        tb->icount = num_insns;
-    }
+    tb->size = ctx.nip - pc_start;
+    tb->icount = num_insns;
+
 #if defined(DEBUG_DISAS)
     if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) {
         int flags;
@@ -11619,16 +11599,6 @@ static inline void gen_intermediate_code_internal(PowerPCCPU *cpu,
 #endif
 }
 
-void gen_intermediate_code (CPUPPCState *env, struct TranslationBlock *tb)
-{
-    gen_intermediate_code_internal(ppc_env_get_cpu(env), tb, false);
-}
-
-void gen_intermediate_code_pc (CPUPPCState *env, struct TranslationBlock *tb)
-{
-    gen_intermediate_code_internal(ppc_env_get_cpu(env), tb, true);
-}
-
 void restore_state_to_opc(CPUPPCState *env, TranslationBlock *tb,
                           target_ulong *data)
 {
diff --git a/target-s390x/translate.c b/target-s390x/translate.c
index 104265d..d4d7c73 100644
--- a/target-s390x/translate.c
+++ b/target-s390x/translate.c
@@ -161,8 +161,6 @@ static char cpu_reg_names[32][4];
 static TCGv_i64 regs[16];
 static TCGv_i64 fregs[16];
 
-static uint8_t gen_opc_cc_op[OPC_BUF_SIZE];
-
 void s390x_translate_init(void)
 {
     int i;
@@ -5319,16 +5317,13 @@ static ExitStatus translate_one(CPUS390XState *env, DisasContext *s)
     return ret;
 }
 
-static inline void gen_intermediate_code_internal(S390CPU *cpu,
-                                                  TranslationBlock *tb,
-                                                  bool search_pc)
+void gen_intermediate_code(CPUS390XState *env, struct TranslationBlock *tb)
 {
+    S390CPU *cpu = s390_env_get_cpu(env);
     CPUState *cs = CPU(cpu);
-    CPUS390XState *env = &cpu->env;
     DisasContext dc;
     target_ulong pc_start;
     uint64_t next_page_start;
-    int j, lj = -1;
     int num_insns, max_insns;
     ExitStatus status;
     bool do_debug;
@@ -5359,19 +5354,6 @@ static inline void gen_intermediate_code_internal(S390CPU *cpu,
     gen_tb_start(tb);
 
     do {
-        if (search_pc) {
-            j = tcg_op_buf_count();
-            if (lj < j) {
-                lj++;
-                while (lj < j) {
-                    tcg_ctx.gen_opc_instr_start[lj++] = 0;
-                }
-            }
-            tcg_ctx.gen_opc_pc[lj] = dc.pc;
-            gen_opc_cc_op[lj] = dc.cc_op;
-            tcg_ctx.gen_opc_instr_start[lj] = 1;
-            tcg_ctx.gen_opc_icount[lj] = num_insns;
-        }
         tcg_gen_insn_start(dc.pc, dc.cc_op);
         num_insns++;
 
@@ -5430,16 +5412,8 @@ static inline void gen_intermediate_code_internal(S390CPU *cpu,
 
     gen_tb_end(tb, num_insns);
 
-    if (search_pc) {
-        j = tcg_op_buf_count();
-        lj++;
-        while (lj <= j) {
-            tcg_ctx.gen_opc_instr_start[lj++] = 0;
-        }
-    } else {
-        tb->size = dc.pc - pc_start;
-        tb->icount = num_insns;
-    }
+    tb->size = dc.pc - pc_start;
+    tb->icount = num_insns;
 
 #if defined(S390X_DEBUG_DISAS)
     if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) {
@@ -5450,16 +5424,6 @@ static inline void gen_intermediate_code_internal(S390CPU *cpu,
 #endif
 }
 
-void gen_intermediate_code (CPUS390XState *env, struct TranslationBlock *tb)
-{
-    gen_intermediate_code_internal(s390_env_get_cpu(env), tb, false);
-}
-
-void gen_intermediate_code_pc (CPUS390XState *env, struct TranslationBlock *tb)
-{
-    gen_intermediate_code_internal(s390_env_get_cpu(env), tb, true);
-}
-
 void restore_state_to_opc(CPUS390XState *env, TranslationBlock *tb,
                           target_ulong *data)
 {
diff --git a/target-sh4/translate.c b/target-sh4/translate.c
index d3fe1de..f764bc2 100644
--- a/target-sh4/translate.c
+++ b/target-sh4/translate.c
@@ -70,8 +70,6 @@ static TCGv cpu_fregs[32];
 /* internal register indexes */
 static TCGv cpu_flags, cpu_delayed_pc;
 
-static uint32_t gen_opc_hflags[OPC_BUF_SIZE];
-
 #include "exec/gen-icount.h"
 
 void sh4_translate_init(void)
@@ -1816,15 +1814,12 @@ static void decode_opc(DisasContext * ctx)
         gen_store_flags(ctx->flags);
 }
 
-static inline void
-gen_intermediate_code_internal(SuperHCPU *cpu, TranslationBlock *tb,
-                               bool search_pc)
+void gen_intermediate_code(CPUSH4State * env, struct TranslationBlock *tb)
 {
+    SuperHCPU *cpu = sh_env_get_cpu(env);
     CPUState *cs = CPU(cpu);
-    CPUSH4State *env = &cpu->env;
     DisasContext ctx;
     target_ulong pc_start;
-    int i, ii;
     int num_insns;
     int max_insns;
 
@@ -1841,7 +1836,6 @@ gen_intermediate_code_internal(SuperHCPU *cpu, TranslationBlock *tb,
     ctx.features = env->features;
     ctx.has_movcal = (ctx.flags & TB_FLAG_PENDING_MOVCA);
 
-    ii = -1;
     num_insns = 0;
     max_insns = tb->cflags & CF_COUNT_MASK;
     if (max_insns == 0) {
@@ -1853,18 +1847,6 @@ gen_intermediate_code_internal(SuperHCPU *cpu, TranslationBlock *tb,
 
     gen_tb_start(tb);
     while (ctx.bstate == BS_NONE && !tcg_op_buf_full()) {
-        if (search_pc) {
-            i = tcg_op_buf_count();
-            if (ii < i) {
-                ii++;
-                while (ii < i)
-                    tcg_ctx.gen_opc_instr_start[ii++] = 0;
-            }
-            tcg_ctx.gen_opc_pc[ii] = ctx.pc;
-            gen_opc_hflags[ii] = ctx.flags;
-            tcg_ctx.gen_opc_instr_start[ii] = 1;
-            tcg_ctx.gen_opc_icount[ii] = num_insns;
-        }
         tcg_gen_insn_start(ctx.pc, ctx.flags);
         num_insns++;
 
@@ -1921,15 +1903,8 @@ gen_intermediate_code_internal(SuperHCPU *cpu, TranslationBlock *tb,
 
     gen_tb_end(tb, num_insns);
 
-    if (search_pc) {
-        i = tcg_op_buf_count();
-        ii++;
-        while (ii <= i)
-            tcg_ctx.gen_opc_instr_start[ii++] = 0;
-    } else {
-        tb->size = ctx.pc - pc_start;
-        tb->icount = num_insns;
-    }
+    tb->size = ctx.pc - pc_start;
+    tb->icount = num_insns;
 
 #ifdef DEBUG_DISAS
     if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) {
@@ -1940,16 +1915,6 @@ gen_intermediate_code_internal(SuperHCPU *cpu, TranslationBlock *tb,
 #endif
 }
 
-void gen_intermediate_code(CPUSH4State * env, struct TranslationBlock *tb)
-{
-    gen_intermediate_code_internal(sh_env_get_cpu(env), tb, false);
-}
-
-void gen_intermediate_code_pc(CPUSH4State * env, struct TranslationBlock *tb)
-{
-    gen_intermediate_code_internal(sh_env_get_cpu(env), tb, true);
-}
-
 void restore_state_to_opc(CPUSH4State *env, TranslationBlock *tb,
                           target_ulong *data)
 {
diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index 18344c8..b59742a 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -64,8 +64,6 @@ static TCGv cpu_wim;
 /* Floating point registers */
 static TCGv_i64 cpu_fpr[TARGET_DPREGS];
 
-static target_ulong gen_opc_npc[OPC_BUF_SIZE];
-
 #include "exec/gen-icount.h"
 
 typedef struct DisasContext {
@@ -5208,15 +5206,12 @@ static void disas_sparc_insn(DisasContext * dc, unsigned int insn)
     }
 }
 
-static inline void gen_intermediate_code_internal(SPARCCPU *cpu,
-                                                  TranslationBlock *tb,
-                                                  bool spc)
+void gen_intermediate_code(CPUSPARCState * env, TranslationBlock * tb)
 {
+    SPARCCPU *cpu = sparc_env_get_cpu(env);
     CPUState *cs = CPU(cpu);
-    CPUSPARCState *env = &cpu->env;
     target_ulong pc_start, last_pc;
     DisasContext dc1, *dc = &dc1;
-    int j, lj = -1;
     int num_insns;
     int max_insns;
     unsigned int insn;
@@ -5245,23 +5240,6 @@ static inline void gen_intermediate_code_internal(SPARCCPU *cpu,
 
     gen_tb_start(tb);
     do {
-        if (spc) {
-            qemu_log("Search PC...\n");
-            j = tcg_op_buf_count();
-            if (lj < j) {
-                lj++;
-                while (lj < j)
-                    tcg_ctx.gen_opc_instr_start[lj++] = 0;
-                tcg_ctx.gen_opc_pc[lj] = dc->pc;
-                gen_opc_npc[lj] = dc->npc;
-                if (dc->npc & JUMP_PC) {
-                    assert(dc->jump_pc[1] == dc->pc + 4);
-                    gen_opc_npc[lj] = dc->jump_pc[0] | JUMP_PC;
-                }
-                tcg_ctx.gen_opc_instr_start[lj] = 1;
-                tcg_ctx.gen_opc_icount[lj] = num_insns;
-            }
-        }
         if (dc->npc & JUMP_PC) {
             assert(dc->jump_pc[1] == dc->pc + 4);
             tcg_gen_insn_start(dc->pc, dc->jump_pc[0] | JUMP_PC);
@@ -5326,18 +5304,9 @@ static inline void gen_intermediate_code_internal(SPARCCPU *cpu,
     }
     gen_tb_end(tb, num_insns);
 
-    if (spc) {
-        j = tcg_op_buf_count();
-        lj++;
-        while (lj <= j)
-            tcg_ctx.gen_opc_instr_start[lj++] = 0;
-#if 0
-        log_page_dump();
-#endif
-    } else {
-        tb->size = last_pc + 4 - pc_start;
-        tb->icount = num_insns;
-    }
+    tb->size = last_pc + 4 - pc_start;
+    tb->icount = num_insns;
+
 #ifdef DEBUG_DISAS
     if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) {
         qemu_log("--------------\n");
@@ -5348,16 +5317,6 @@ static inline void gen_intermediate_code_internal(SPARCCPU *cpu,
 #endif
 }
 
-void gen_intermediate_code(CPUSPARCState * env, TranslationBlock * tb)
-{
-    gen_intermediate_code_internal(sparc_env_get_cpu(env), tb, false);
-}
-
-void gen_intermediate_code_pc(CPUSPARCState * env, TranslationBlock * tb)
-{
-    gen_intermediate_code_internal(sparc_env_get_cpu(env), tb, true);
-}
-
 void gen_intermediate_code_init(CPUSPARCState *env)
 {
     unsigned int i;
diff --git a/target-tilegx/translate.c b/target-tilegx/translate.c
index eae5622..ff96165 100644
--- a/target-tilegx/translate.c
+++ b/target-tilegx/translate.c
@@ -2049,17 +2049,14 @@ static void translate_one_bundle(DisasContext *dc, uint64_t bundle)
     }
 }
 
-static inline void gen_intermediate_code_internal(TileGXCPU *cpu,
-                                                  TranslationBlock *tb,
-                                                  bool search_pc)
+void gen_intermediate_code(CPUTLGState *env, struct TranslationBlock *tb)
 {
+    TileGXCPU *cpu = tilegx_env_get_cpu(env);
     DisasContext ctx;
     DisasContext *dc = &ctx;
     CPUState *cs = CPU(cpu);
-    CPUTLGState *env = &cpu->env;
     uint64_t pc_start = tb->pc;
     uint64_t next_page_start = (pc_start & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
-    int j, lj = -1;
     int num_insns = 0;
     int max_insns = tb->cflags & CF_COUNT_MASK;
 
@@ -2087,18 +2084,6 @@ static inline void gen_intermediate_code_internal(TileGXCPU *cpu,
     gen_tb_start(tb);
 
     while (1) {
-        if (search_pc) {
-            j = tcg_op_buf_count();
-            if (lj < j) {
-                lj++;
-                while (lj < j) {
-                    tcg_ctx.gen_opc_instr_start[lj++] = 0;
-                }
-            }
-            tcg_ctx.gen_opc_pc[lj] = dc->pc;
-            tcg_ctx.gen_opc_instr_start[lj] = 1;
-            tcg_ctx.gen_opc_icount[lj] = num_insns;
-        }
         tcg_gen_insn_start(dc->pc);
         num_insns++;
 
@@ -2120,30 +2105,12 @@ static inline void gen_intermediate_code_internal(TileGXCPU *cpu,
     }
 
     gen_tb_end(tb, num_insns);
-    if (search_pc) {
-        j = tcg_op_buf_count();
-        lj++;
-        while (lj <= j) {
-            tcg_ctx.gen_opc_instr_start[lj++] = 0;
-        }
-    } else {
-        tb->size = dc->pc - pc_start;
-        tb->icount = num_insns;
-    }
+    tb->size = dc->pc - pc_start;
+    tb->icount = num_insns;
 
     qemu_log_mask(CPU_LOG_TB_IN_ASM, "\n");
 }
 
-void gen_intermediate_code(CPUTLGState *env, struct TranslationBlock *tb)
-{
-    gen_intermediate_code_internal(tilegx_env_get_cpu(env), tb, false);
-}
-
-void gen_intermediate_code_pc(CPUTLGState *env, struct TranslationBlock *tb)
-{
-    gen_intermediate_code_internal(tilegx_env_get_cpu(env), tb, true);
-}
-
 void restore_state_to_opc(CPUTLGState *env, TranslationBlock *tb,
                           target_ulong *data)
 {
diff --git a/target-tricore/translate.c b/target-tricore/translate.c
index 6f5438f..135c583 100644
--- a/target-tricore/translate.c
+++ b/target-tricore/translate.c
@@ -8266,20 +8266,14 @@ static void decode_opc(CPUTriCoreState *env, DisasContext *ctx, int *is_branch)
     }
 }
 
-static inline void
-gen_intermediate_code_internal(TriCoreCPU *cpu, struct TranslationBlock *tb,
-                              int search_pc)
+void gen_intermediate_code(CPUTriCoreState *env, struct TranslationBlock *tb)
 {
+    TriCoreCPU *cpu = tricore_env_get_cpu(env);
     CPUState *cs = CPU(cpu);
-    CPUTriCoreState *env = &cpu->env;
     DisasContext ctx;
     target_ulong pc_start;
     int num_insns, max_insns;
 
-    if (search_pc) {
-        qemu_log("search pc %d\n", search_pc);
-    }
-
     num_insns = 0;
     max_insns = tb->cflags & CF_COUNT_MASK;
     if (max_insns == 0) {
@@ -8318,12 +8312,9 @@ gen_intermediate_code_internal(TriCoreCPU *cpu, struct TranslationBlock *tb,
     }
 
     gen_tb_end(tb, num_insns);
-    if (search_pc) {
-        printf("done_generating search pc\n");
-    } else {
-        tb->size = ctx.pc - pc_start;
-        tb->icount = num_insns;
-    }
+    tb->size = ctx.pc - pc_start;
+    tb->icount = num_insns;
+
     if (tcg_check_temp_count()) {
         printf("LEAK at %08x\n", env->PC);
     }
@@ -8338,18 +8329,6 @@ gen_intermediate_code_internal(TriCoreCPU *cpu, struct TranslationBlock *tb,
 }
 
 void
-gen_intermediate_code(CPUTriCoreState *env, struct TranslationBlock *tb)
-{
-    gen_intermediate_code_internal(tricore_env_get_cpu(env), tb, false);
-}
-
-void
-gen_intermediate_code_pc(CPUTriCoreState *env, struct TranslationBlock *tb)
-{
-    gen_intermediate_code_internal(tricore_env_get_cpu(env), tb, true);
-}
-
-void
 restore_state_to_opc(CPUTriCoreState *env, TranslationBlock *tb,
                      target_ulong *data)
 {
diff --git a/target-unicore32/translate.c b/target-unicore32/translate.c
index 9d8167a..48f89fb 100644
--- a/target-unicore32/translate.c
+++ b/target-unicore32/translate.c
@@ -1864,15 +1864,12 @@ static void disas_uc32_insn(CPUUniCore32State *env, DisasContext *s)
 }
 
 /* generate intermediate code in gen_opc_buf and gen_opparam_buf for
-   basic block 'tb'. If search_pc is TRUE, also generate PC
-   information for each intermediate instruction. */
-static inline void gen_intermediate_code_internal(UniCore32CPU *cpu,
-        TranslationBlock *tb, bool search_pc)
+   basic block 'tb'.  */
+void gen_intermediate_code(CPUUniCore32State *env, TranslationBlock *tb)
 {
+    UniCore32CPU *cpu = uc32_env_get_cpu(env);
     CPUState *cs = CPU(cpu);
-    CPUUniCore32State *env = &cpu->env;
     DisasContext dc1, *dc = &dc1;
-    int j, lj;
     target_ulong pc_start;
     uint32_t next_page_start;
     int num_insns;
@@ -1894,7 +1891,6 @@ static inline void gen_intermediate_code_internal(UniCore32CPU *cpu,
     cpu_F0d = tcg_temp_new_i64();
     cpu_F1d = tcg_temp_new_i64();
     next_page_start = (pc_start & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
-    lj = -1;
     num_insns = 0;
     max_insns = tb->cflags & CF_COUNT_MASK;
     if (max_insns == 0) {
@@ -1914,18 +1910,6 @@ static inline void gen_intermediate_code_internal(UniCore32CPU *cpu,
 
     gen_tb_start(tb);
     do {
-        if (search_pc) {
-            j = tcg_op_buf_count();
-            if (lj < j) {
-                lj++;
-                while (lj < j) {
-                    tcg_ctx.gen_opc_instr_start[lj++] = 0;
-                }
-            }
-            tcg_ctx.gen_opc_pc[lj] = dc->pc;
-            tcg_ctx.gen_opc_instr_start[lj] = 1;
-            tcg_ctx.gen_opc_icount[lj] = num_insns;
-        }
         tcg_gen_insn_start(dc->pc);
         num_insns++;
 
@@ -2039,26 +2023,8 @@ done_generating:
         qemu_log("\n");
     }
 #endif
-    if (search_pc) {
-        j = tcg_op_buf_count();
-        lj++;
-        while (lj <= j) {
-            tcg_ctx.gen_opc_instr_start[lj++] = 0;
-        }
-    } else {
-        tb->size = dc->pc - pc_start;
-        tb->icount = num_insns;
-    }
-}
-
-void gen_intermediate_code(CPUUniCore32State *env, TranslationBlock *tb)
-{
-    gen_intermediate_code_internal(uc32_env_get_cpu(env), tb, false);
-}
-
-void gen_intermediate_code_pc(CPUUniCore32State *env, TranslationBlock *tb)
-{
-    gen_intermediate_code_internal(uc32_env_get_cpu(env), tb, true);
+    tb->size = dc->pc - pc_start;
+    tb->icount = num_insns;
 }
 
 static const char *cpu_mode_names[16] = {
diff --git a/target-xtensa/translate.c b/target-xtensa/translate.c
index ac967e1..fda91b7 100644
--- a/target-xtensa/translate.c
+++ b/target-xtensa/translate.c
@@ -2997,15 +2997,12 @@ static void gen_ibreak_check(CPUXtensaState *env, DisasContext *dc)
     }
 }
 
-static inline
-void gen_intermediate_code_internal(XtensaCPU *cpu,
-                                    TranslationBlock *tb, bool search_pc)
+void gen_intermediate_code(CPUXtensaState *env, TranslationBlock *tb)
 {
+    XtensaCPU *cpu = xtensa_env_get_cpu(env);
     CPUState *cs = CPU(cpu);
-    CPUXtensaState *env = &cpu->env;
     DisasContext dc;
     int insn_count = 0;
-    int j, lj = -1;
     int max_insns = tb->cflags & CF_COUNT_MASK;
     uint32_t pc_start = tb->pc;
     uint32_t next_page_start =
@@ -3049,18 +3046,6 @@ void gen_intermediate_code_internal(XtensaCPU *cpu,
     }
 
     do {
-        if (search_pc) {
-            j = tcg_op_buf_count();
-            if (lj < j) {
-                lj++;
-                while (lj < j) {
-                    tcg_ctx.gen_opc_instr_start[lj++] = 0;
-                }
-            }
-            tcg_ctx.gen_opc_pc[lj] = dc.pc;
-            tcg_ctx.gen_opc_instr_start[lj] = 1;
-            tcg_ctx.gen_opc_icount[lj] = insn_count;
-        }
         tcg_gen_insn_start(dc.pc);
         ++insn_count;
 
@@ -3131,24 +3116,8 @@ void gen_intermediate_code_internal(XtensaCPU *cpu,
         qemu_log("\n");
     }
 #endif
-    if (search_pc) {
-        j = tcg_op_buf_count();
-        memset(tcg_ctx.gen_opc_instr_start + lj + 1, 0,
-                (j - lj) * sizeof(tcg_ctx.gen_opc_instr_start[0]));
-    } else {
-        tb->size = dc.pc - pc_start;
-        tb->icount = insn_count;
-    }
-}
-
-void gen_intermediate_code(CPUXtensaState *env, TranslationBlock *tb)
-{
-    gen_intermediate_code_internal(xtensa_env_get_cpu(env), tb, false);
-}
-
-void gen_intermediate_code_pc(CPUXtensaState *env, TranslationBlock *tb)
-{
-    gen_intermediate_code_internal(xtensa_env_get_cpu(env), tb, true);
+    tb->size = dc.pc - pc_start;
+    tb->icount = insn_count;
 }
 
 void xtensa_cpu_dump_state(CPUState *cs, FILE *f,
diff --git a/tcg/tcg.h b/tcg/tcg.h
index df499c6..d079a91 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -578,10 +578,6 @@ struct TCGContext {
     TCGOp gen_op_buf[OPC_BUF_SIZE];
     TCGArg gen_opparam_buf[OPPARAM_BUF_SIZE];
 
-    target_ulong gen_opc_pc[OPC_BUF_SIZE];
-    uint16_t gen_opc_icount[OPC_BUF_SIZE];
-    uint8_t gen_opc_instr_start[OPC_BUF_SIZE];
-
     uint16_t gen_insn_end_off[TCG_MAX_INSNS];
     target_ulong gen_insn_data[TCG_MAX_INSNS][TARGET_INSN_START_WORDS];
 };
-- 
2.4.3


* [Qemu-devel] [PATCH v3 22/25] tcg: Remove tcg_gen_code_search_pc
  2015-09-22 20:24 [Qemu-devel] [PATCH v3 00/25] Do away with TB retranslation Richard Henderson
                   ` (20 preceding siblings ...)
  2015-09-22 20:25 ` [Qemu-devel] [PATCH v3 21/25] tcg: Remove gen_intermediate_code_pc Richard Henderson
@ 2015-09-22 20:25 ` Richard Henderson
  2015-09-25 21:11   ` Aurelien Jarno
  2015-09-22 20:25 ` [Qemu-devel] [PATCH v3 23/25] tcg: Emit prologue to the beginning of code_gen_buffer Richard Henderson
                   ` (2 subsequent siblings)
  24 siblings, 1 reply; 53+ messages in thread
From: Richard Henderson @ 2015-09-22 20:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, alex.bennee, aurelien

It's no longer used, so tidy up everything reached by it.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/tcg.c | 59 +++++++++++++++++++----------------------------------------
 tcg/tcg.h |  2 --
 2 files changed, 19 insertions(+), 42 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index a0fce5b..8126af9 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -2290,12 +2290,28 @@ void tcg_dump_op_count(FILE *f, fprintf_function cpu_fprintf)
 #endif
 
 
-static inline int tcg_gen_code_common(TCGContext *s,
-                                      tcg_insn_unit *gen_code_buf,
-                                      long search_pc)
+int tcg_gen_code(TCGContext *s, tcg_insn_unit *gen_code_buf)
 {
     int i, oi, oi_next, num_insns;
 
+#ifdef CONFIG_PROFILER
+    {
+        int n;
+
+        n = s->gen_last_op_idx + 1;
+        s->op_count += n;
+        if (n > s->op_count_max) {
+            s->op_count_max = n;
+        }
+
+        n = s->nb_temps;
+        s->temp_count += n;
+        if (n > s->temp_count_max) {
+            s->temp_count_max = n;
+        }
+    }
+#endif
+
 #ifdef DEBUG_DISAS
     if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP))) {
         qemu_log("OP:\n");
@@ -2398,9 +2414,6 @@ static inline int tcg_gen_code_common(TCGContext *s,
             tcg_reg_alloc_op(s, def, opc, args, dead_args, sync_args);
             break;
         }
-        if (search_pc >= 0 && search_pc < tcg_current_code_size(s)) {
-            return oi;
-        }
 #ifndef NDEBUG
         check_regs(s);
 #endif
@@ -2410,30 +2423,6 @@ static inline int tcg_gen_code_common(TCGContext *s,
 
     /* Generate TB finalization at the end of block */
     tcg_out_tb_finalize(s);
-    return -1;
-}
-
-int tcg_gen_code(TCGContext *s, tcg_insn_unit *gen_code_buf)
-{
-#ifdef CONFIG_PROFILER
-    {
-        int n;
-
-        n = s->gen_last_op_idx + 1;
-        s->op_count += n;
-        if (n > s->op_count_max) {
-            s->op_count_max = n;
-        }
-
-        n = s->nb_temps;
-        s->temp_count += n;
-        if (n > s->temp_count_max) {
-            s->temp_count_max = n;
-        }
-    }
-#endif
-
-    tcg_gen_code_common(s, gen_code_buf, -1);
 
     /* flush instruction cache */
     flush_icache_range((uintptr_t)s->code_buf, (uintptr_t)s->code_ptr);
@@ -2441,16 +2430,6 @@ int tcg_gen_code(TCGContext *s, tcg_insn_unit *gen_code_buf)
     return tcg_current_code_size(s);
 }
 
-/* Return the index of the micro operation such as the pc after is <
-   offset bytes from the start of the TB.  The contents of gen_code_buf must
-   not be changed, though writing the same values is ok.
-   Return -1 if not found. */
-int tcg_gen_code_search_pc(TCGContext *s, tcg_insn_unit *gen_code_buf,
-                           long offset)
-{
-    return tcg_gen_code_common(s, gen_code_buf, offset);
-}
-
 #ifdef CONFIG_PROFILER
 void tcg_dump_info(FILE *f, fprintf_function cpu_fprintf)
 {
diff --git a/tcg/tcg.h b/tcg/tcg.h
index d079a91..5fbbd15 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -626,8 +626,6 @@ void tcg_prologue_init(TCGContext *s);
 void tcg_func_start(TCGContext *s);
 
 int tcg_gen_code(TCGContext *s, tcg_insn_unit *gen_code_buf);
-int tcg_gen_code_search_pc(TCGContext *s, tcg_insn_unit *gen_code_buf,
-                           long offset);
 
 void tcg_set_frame(TCGContext *s, int reg, intptr_t start, intptr_t size);
 
-- 
2.4.3


* [Qemu-devel] [PATCH v3 23/25] tcg: Emit prologue to the beginning of code_gen_buffer
  2015-09-22 20:24 [Qemu-devel] [PATCH v3 00/25] Do away with TB retranslation Richard Henderson
                   ` (21 preceding siblings ...)
  2015-09-22 20:25 ` [Qemu-devel] [PATCH v3 22/25] tcg: Remove tcg_gen_code_search_pc Richard Henderson
@ 2015-09-22 20:25 ` Richard Henderson
  2015-09-23 19:28   ` Peter Maydell
  2015-09-22 20:25 ` [Qemu-devel] [PATCH v3 24/25] tcg: Allocate a guard page after code_gen_buffer Richard Henderson
  2015-09-22 20:25 ` [Qemu-devel] [PATCH v3 25/25] tcg: Check for overflow via highwater mark Richard Henderson
  24 siblings, 1 reply; 53+ messages in thread
From: Richard Henderson @ 2015-09-22 20:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, alex.bennee, aurelien

By putting the prologue at the end, we risk overwriting the
prologue should our estimate of maximum TB size prove too small.
Given the two different placements of the call to tcg_prologue_init,
move the high water mark computation into tcg_prologue_init.
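
As a rough illustration of the layout described above, here is a
standalone sketch (not the QEMU code).  CodeBuf, emit_prologue and
codebuf_init are hypothetical names; the 192 * 640 byte reserve mirrors
TCG_MAX_OP_SIZE * OPC_BUF_SIZE as they stand at this point in the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Worst-case reserve, mirroring TCG_MAX_OP_SIZE * OPC_BUF_SIZE. */
enum { WORST_CASE_TB = 192 * 640 };

/* Hypothetical stand-in for the relevant TCGContext fields. */
typedef struct {
    uint8_t *prologue;   /* where the prologue lives */
    uint8_t *buffer;     /* start of the region usable for TBs */
    size_t   size;       /* bytes usable for TBs */
    size_t   max_size;   /* high-water mark: flush once usage reaches this */
} CodeBuf;

/* Hypothetical prologue emitter: writes n filler bytes, returns the end. */
static uint8_t *emit_prologue(uint8_t *p, size_t n)
{
    memset(p, 0x90, n);
    return p + n;
}

/* Put the prologue first, then deduct it from the usable buffer. */
static void codebuf_init(CodeBuf *cb, uint8_t *mem, size_t total,
                         size_t prologue_len)
{
    uint8_t *end;

    assert(prologue_len + WORST_CASE_TB < total);

    cb->prologue = mem;
    end = emit_prologue(mem, prologue_len);

    /* TBs start right after the prologue, so an overrun at the far end
       of the buffer can no longer clobber the prologue. */
    cb->buffer = end;
    cb->size = total - (size_t)(end - mem);
    cb->max_size = cb->size - WORST_CASE_TB;
}
```

Because the mark is computed after the prologue has been emitted, it
accounts for the prologue's real size rather than a fixed 1024-byte guess.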

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/tcg.c       | 25 +++++++++++++++++++------
 translate-all.c | 29 ++++++++++-------------------
 2 files changed, 29 insertions(+), 25 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index 8126af9..db4032a 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -363,17 +363,30 @@ void tcg_context_init(TCGContext *s)
 
 void tcg_prologue_init(TCGContext *s)
 {
-    /* init global prologue and epilogue */
-    s->code_buf = s->code_gen_prologue;
-    s->code_ptr = s->code_buf;
+    size_t prologue_size, total_size;
+
+    /* Put the prologue at the beginning of code_gen_buffer.  */
+    s->code_ptr = s->code_buf = s->code_gen_prologue = s->code_gen_buffer;
+
+    /* Generate the prologue.  */
     tcg_target_qemu_prologue(s);
     flush_icache_range((uintptr_t)s->code_buf, (uintptr_t)s->code_ptr);
 
+    /* Deduct the prologue from the buffer.  */
+    prologue_size = tcg_current_code_size(s);
+    s->code_gen_ptr = s->code_gen_buffer = s->code_buf = s->code_ptr;
+
+    /* Compute a high-water mark, at which we voluntarily flush the
+       buffer and start over.  */
+    total_size = s->code_gen_buffer_size -= prologue_size;
+    s->code_gen_buffer_max_size = total_size - TCG_MAX_OP_SIZE * OPC_BUF_SIZE;
+
+    tcg_register_jit(s->code_gen_buffer, total_size);
+
 #ifdef DEBUG_DISAS
     if (qemu_loglevel_mask(CPU_LOG_TB_OUT_ASM)) {
-        size_t size = tcg_current_code_size(s);
-        qemu_log("PROLOGUE: [size=%zu]\n", size);
-        log_disas(s->code_buf, size);
+        qemu_log("PROLOGUE: [size=%zu]\n", prologue_size);
+        log_disas(s->code_gen_prologue, prologue_size);
         qemu_log("\n");
         qemu_log_flush();
     }
diff --git a/translate-all.c b/translate-all.c
index f6b8148..4c994bb 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -689,23 +689,16 @@ static inline void code_gen_alloc(size_t tb_size)
     }
 
     qemu_madvise(tcg_ctx.code_gen_buffer, tcg_ctx.code_gen_buffer_size,
-            QEMU_MADV_HUGEPAGE);
-
-    /* Steal room for the prologue at the end of the buffer.  This ensures
-       (via the MAX_CODE_GEN_BUFFER_SIZE limits above) that direct branches
-       from TB's to the prologue are going to be in range.  It also means
-       that we don't need to mark (additional) portions of the data segment
-       as executable.  */
-    tcg_ctx.code_gen_prologue = tcg_ctx.code_gen_buffer +
-            tcg_ctx.code_gen_buffer_size - 1024;
-    tcg_ctx.code_gen_buffer_size -= 1024;
-
-    tcg_ctx.code_gen_buffer_max_size = tcg_ctx.code_gen_buffer_size -
-        (TCG_MAX_OP_SIZE * OPC_BUF_SIZE);
-    tcg_ctx.code_gen_max_blocks = tcg_ctx.code_gen_buffer_size /
-            CODE_GEN_AVG_BLOCK_SIZE;
-    tcg_ctx.tb_ctx.tbs =
-            g_malloc(tcg_ctx.code_gen_max_blocks * sizeof(TranslationBlock));
+                 QEMU_MADV_HUGEPAGE);
+
+    /* Estimate a good size for the number of TBs we can support.  We
+       still haven't deducted the prologue from the buffer size here,
+       but that's minimal and won't affect the estimate much.  */
+    tcg_ctx.code_gen_max_blocks
+        = tcg_ctx.code_gen_buffer_size / CODE_GEN_AVG_BLOCK_SIZE;
+    tcg_ctx.tb_ctx.tbs
+        = g_malloc(tcg_ctx.code_gen_max_blocks * sizeof(TranslationBlock));
+
     qemu_mutex_init(&tcg_ctx.tb_ctx.tb_lock);
 }
 
@@ -716,8 +709,6 @@ void tcg_exec_init(unsigned long tb_size)
 {
     cpu_gen_init();
     code_gen_alloc(tb_size);
-    tcg_ctx.code_gen_ptr = tcg_ctx.code_gen_buffer;
-    tcg_register_jit(tcg_ctx.code_gen_buffer, tcg_ctx.code_gen_buffer_size);
     page_init();
 #if defined(CONFIG_SOFTMMU)
     /* There's no guest base to take into account, so go ahead and
-- 
2.4.3


* [Qemu-devel] [PATCH v3 24/25] tcg: Allocate a guard page after code_gen_buffer
  2015-09-22 20:24 [Qemu-devel] [PATCH v3 00/25] Do away with TB retranslation Richard Henderson
                   ` (22 preceding siblings ...)
  2015-09-22 20:25 ` [Qemu-devel] [PATCH v3 23/25] tcg: Emit prologue to the beginning of code_gen_buffer Richard Henderson
@ 2015-09-22 20:25 ` Richard Henderson
  2015-09-23 19:39   ` Peter Maydell
  2015-09-22 20:25 ` [Qemu-devel] [PATCH v3 25/25] tcg: Check for overflow via highwater mark Richard Henderson
  24 siblings, 1 reply; 53+ messages in thread
From: Richard Henderson @ 2015-09-22 20:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, alex.bennee, aurelien

This will catch any overflow of the buffer.

Add a native win32 alternative for alloc_code_gen_buffer;
remove the malloc alternative.
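
The POSIX side of this can be sketched in isolation as follows.  This is
a minimal illustration under assumptions, not the patch itself:
alloc_with_guard is a hypothetical helper, the page size is taken from
sysconf() rather than qemu_real_host_page_size, and PROT_EXEC is left
out so the sketch also runs on hosts that refuse writable+executable
mappings.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS with strict -std= modes */
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Reserve the requested size (rounded up to whole pages) plus one extra
   page with no access at all, then open up everything except that last
   page.  Returns NULL on failure; *usable receives the rounded size. */
static void *alloc_with_guard(size_t size, size_t *usable)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    uint8_t *buf;

    size = (size + page - 1) & ~(page - 1);

    buf = mmap(NULL, size + page, PROT_NONE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED) {
        return NULL;
    }

    /* The trailing page keeps PROT_NONE, so a write past the end of
       the buffer faults instead of silently corrupting memory. */
    if (mprotect(buf, size, PROT_READ | PROT_WRITE) != 0) {
        munmap(buf, size + page);
        return NULL;
    }

    *usable = size;
    return buf;
}
```

Mapping everything PROT_NONE first and then widening access means the
guard page never needs a separate call to be protected.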

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 translate-all.c | 210 ++++++++++++++++++++++++++++++++------------------------
 1 file changed, 119 insertions(+), 91 deletions(-)

diff --git a/translate-all.c b/translate-all.c
index 4c994bb..0049927 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -311,31 +311,6 @@ bool cpu_restore_state(CPUState *cpu, uintptr_t retaddr)
     return false;
 }
 
-#ifdef _WIN32
-static __attribute__((unused)) void map_exec(void *addr, long size)
-{
-    DWORD old_protect;
-    VirtualProtect(addr, size,
-                   PAGE_EXECUTE_READWRITE, &old_protect);
-}
-#else
-static __attribute__((unused)) void map_exec(void *addr, long size)
-{
-    unsigned long start, end, page_size;
-
-    page_size = getpagesize();
-    start = (unsigned long)addr;
-    start &= ~(page_size - 1);
-
-    end = (unsigned long)addr + size;
-    end += page_size - 1;
-    end &= ~(page_size - 1);
-
-    mprotect((void *)start, end - start,
-             PROT_READ | PROT_WRITE | PROT_EXEC);
-}
-#endif
-
 void page_size_init(void)
 {
     /* NOTE: we can always suppose that qemu_host_page_size >=
@@ -472,14 +447,6 @@ static inline PageDesc *page_find(tb_page_addr_t index)
 #define USE_STATIC_CODE_GEN_BUFFER
 #endif
 
-/* ??? Should configure for this, not list operating systems here.  */
-#if (defined(__linux__) \
-    || defined(__FreeBSD__) || defined(__FreeBSD_kernel__) \
-    || defined(__DragonFly__) || defined(__OpenBSD__) \
-    || defined(__NetBSD__))
-# define USE_MMAP
-#endif
-
 /* Minimum size of the code gen buffer.  This number is randomly chosen,
    but not so small that we can't have a fair number of TB's live.  */
 #define MIN_CODE_GEN_BUFFER_SIZE     (1024u * 1024)
@@ -567,22 +534,102 @@ static inline void *split_cross_256mb(void *buf1, size_t size1)
 static uint8_t static_code_gen_buffer[DEFAULT_CODE_GEN_BUFFER_SIZE]
     __attribute__((aligned(CODE_GEN_ALIGN)));
 
+# ifdef _WIN32
+static inline void do_protect(void *addr, long size, int prot)
+{
+    DWORD old_protect;
+    VirtualProtect(addr, size, prot, &old_protect);
+}
+
+static inline void map_exec(void *addr, long size)
+{
+    do_protect(addr, size, PAGE_EXECUTE_READWRITE);
+}
+
+static inline void map_none(void *addr, long size)
+{
+    do_protect(addr, size, PAGE_NOACCESS);
+}
+# else
+static inline void do_protect(void *addr, long size, int prot)
+{
+    uintptr_t start, end;
+
+    start = (uintptr_t)addr;
+    start &= qemu_real_host_page_mask;
+
+    end = (uintptr_t)addr + size;
+    end = ROUND_UP(end, qemu_real_host_page_size);
+
+    mprotect((void *)start, end - start, prot);
+}
+
+static inline void map_exec(void *addr, long size)
+{
+    do_protect(addr, size, PROT_READ | PROT_WRITE | PROT_EXEC);
+}
+
+static inline void map_none(void *addr, long size)
+{
+    do_protect(addr, size, PROT_NONE);
+}
+# endif /* WIN32 */
+
 static inline void *alloc_code_gen_buffer(void)
 {
     void *buf = static_code_gen_buffer;
+    size_t full_size, size;
+
+    /* The size of the buffer, rounded down to end on a page boundary.  */
+    full_size = (((uintptr_t)buf + sizeof(static_code_gen_buffer))
+                 & qemu_real_host_page_mask) - (uintptr_t)buf;
+
+    /* Reserve a guard page.  */
+    size = full_size - qemu_real_host_page_size;
+
+    /* Honor a command-line option limiting the size of the buffer.  */
+    if (size > tcg_ctx.code_gen_buffer_size) {
+        size = (((uintptr_t)buf + tcg_ctx.code_gen_buffer_size)
+                & qemu_real_host_page_mask) - (uintptr_t)buf;
+    }
+    tcg_ctx.code_gen_buffer_size = size;
+
 #ifdef __mips__
-    if (cross_256mb(buf, tcg_ctx.code_gen_buffer_size)) {
-        buf = split_cross_256mb(buf, tcg_ctx.code_gen_buffer_size);
+    if (cross_256mb(buf, size)) {
+        buf = split_cross_256mb(buf, size);
+        size = tcg_ctx.code_gen_buffer_size;
     }
 #endif
-    map_exec(buf, tcg_ctx.code_gen_buffer_size);
+
+    map_exec(buf, size);
+    map_none(buf + size, qemu_real_host_page_size);
+    qemu_madvise(buf, size, QEMU_MADV_HUGEPAGE);
+
     return buf;
 }
-#elif defined(USE_MMAP)
+#elif defined(_WIN32)
+static inline void *alloc_code_gen_buffer(void)
+{
+    size_t size = tcg_ctx.code_gen_buffer_size;
+    void *buf1, *buf2;
+
+    /* Perform the allocation in two steps, so that the guard page
+       is reserved but uncommitted.  */
+    buf1 = VirtualAlloc(NULL, size + qemu_real_host_page_size,
+                        MEM_RESERVE, PAGE_NOACCESS);
+    if (buf1 != NULL) {
+        buf2 = VirtualAlloc(buf1, size, MEM_COMMIT, PAGE_EXECUTE_READWRITE);
+        assert(buf1 == buf2);
+    }
+
+    return buf1;
+}
+#else
 static inline void *alloc_code_gen_buffer(void)
 {
     int flags = MAP_PRIVATE | MAP_ANONYMOUS;
     uintptr_t start = 0;
+    size_t size = tcg_ctx.code_gen_buffer_size;
     void *buf;
 
     /* Constrain the position of the buffer based on the host cpu.
@@ -598,86 +645,70 @@ static inline void *alloc_code_gen_buffer(void)
        Leave the choice of exact location with the kernel.  */
     flags |= MAP_32BIT;
     /* Cannot expect to map more than 800MB in low memory.  */
-    if (tcg_ctx.code_gen_buffer_size > 800u * 1024 * 1024) {
-        tcg_ctx.code_gen_buffer_size = 800u * 1024 * 1024;
+    if (size > 800u * 1024 * 1024) {
+        tcg_ctx.code_gen_buffer_size = size = 800u * 1024 * 1024;
     }
 # elif defined(__sparc__)
     start = 0x40000000ul;
 # elif defined(__s390x__)
     start = 0x90000000ul;
 # elif defined(__mips__)
-    /* ??? We ought to more explicitly manage layout for softmmu too.  */
-#  ifdef CONFIG_USER_ONLY
-    start = 0x68000000ul;
-#  elif _MIPS_SIM == _ABI64
+#  if _MIPS_SIM == _ABI64
     start = 0x128000000ul;
 #  else
     start = 0x08000000ul;
 #  endif
 # endif
 
-    buf = mmap((void *)start, tcg_ctx.code_gen_buffer_size,
-               PROT_WRITE | PROT_READ | PROT_EXEC, flags, -1, 0);
+    buf = mmap((void *)start, size + qemu_real_host_page_size,
+               PROT_NONE, flags, -1, 0);
     if (buf == MAP_FAILED) {
         return NULL;
     }
 
 #ifdef __mips__
-    if (cross_256mb(buf, tcg_ctx.code_gen_buffer_size)) {
+    if (cross_256mb(buf, size)) {
         /* Try again, with the original still mapped, to avoid re-acquiring
            that 256mb crossing.  This time don't specify an address.  */
-        size_t size2, size1 = tcg_ctx.code_gen_buffer_size;
-        void *buf2 = mmap(NULL, size1, PROT_WRITE | PROT_READ | PROT_EXEC,
-                          flags, -1, 0);
-        if (buf2 != MAP_FAILED) {
-            if (!cross_256mb(buf2, size1)) {
+        size_t size2;
+        void *buf2 = mmap(NULL, size + qemu_real_host_page_size,
+                          PROT_NONE, flags, -1, 0);
+        switch (buf2 != MAP_FAILED) {
+        case 1:
+            if (!cross_256mb(buf2, size)) {
                 /* Success!  Use the new buffer.  */
-                munmap(buf, size1);
-                return buf2;
+                munmap(buf, size);
+                break;
             }
             /* Failure.  Work with what we had.  */
-            munmap(buf2, size1);
+            munmap(buf2, size);
+            /* fallthru */
+        default:
+            /* Split the original buffer.  Free the smaller half.  */
+            buf2 = split_cross_256mb(buf, size);
+            size2 = tcg_ctx.code_gen_buffer_size;
+            if (buf == buf2) {
+                munmap(buf + size2 + qemu_real_host_page_size, size - size2);
+            } else {
+                munmap(buf, size - size2);
+            }
+            size = size2;
+            break;
         }
-
-        /* Split the original buffer.  Free the smaller half.  */
-        buf2 = split_cross_256mb(buf, size1);
-        size2 = tcg_ctx.code_gen_buffer_size;
-        munmap(buf + (buf == buf2 ? size2 : 0), size1 - size2);
-        return buf2;
+        buf = buf2;
     }
 #endif
 
-    return buf;
-}
-#else
-static inline void *alloc_code_gen_buffer(void)
-{
-    void *buf = g_try_malloc(tcg_ctx.code_gen_buffer_size);
+    /* Make the final buffer accessible.  The guard page at the end
+       will remain inaccessible with PROT_NONE.  */
+    mprotect(buf, size, PROT_WRITE | PROT_READ | PROT_EXEC);
 
-    if (buf == NULL) {
-        return NULL;
-    }
+    /* Request large pages for the buffer.  */
+    qemu_madvise(buf, size, QEMU_MADV_HUGEPAGE);
 
-#ifdef __mips__
-    if (cross_256mb(buf, tcg_ctx.code_gen_buffer_size)) {
-        void *buf2 = g_malloc(tcg_ctx.code_gen_buffer_size);
-        if (buf2 != NULL && !cross_256mb(buf2, size1)) {
-            /* Success!  Use the new buffer.  */
-            free(buf);
-            buf = buf2;
-        } else {
-            /* Failure.  Work with what we had.  Since this is malloc
-               and not mmap, we can't free the other half.  */
-            free(buf2);
-            buf = split_cross_256mb(buf, tcg_ctx.code_gen_buffer_size);
-        }
-    }
-#endif
-
-    map_exec(buf, tcg_ctx.code_gen_buffer_size);
     return buf;
 }
-#endif /* USE_STATIC_CODE_GEN_BUFFER, USE_MMAP */
+#endif /* USE_STATIC_CODE_GEN_BUFFER, WIN32, POSIX */
 
 static inline void code_gen_alloc(size_t tb_size)
 {
@@ -688,9 +719,6 @@ static inline void code_gen_alloc(size_t tb_size)
         exit(1);
     }
 
-    qemu_madvise(tcg_ctx.code_gen_buffer, tcg_ctx.code_gen_buffer_size,
-                 QEMU_MADV_HUGEPAGE);
-
     /* Estimate a good size for the number of TBs we can support.  We
        still haven't deducted the prologue from the buffer size here,
        but that's minimal and won't affect the estimate much.  */
@@ -708,8 +736,8 @@ static inline void code_gen_alloc(size_t tb_size)
 void tcg_exec_init(unsigned long tb_size)
 {
     cpu_gen_init();
-    code_gen_alloc(tb_size);
     page_init();
+    code_gen_alloc(tb_size);
 #if defined(CONFIG_SOFTMMU)
     /* There's no guest base to take into account, so go ahead and
        initialize the prologue now.  */
-- 
2.4.3


* [Qemu-devel] [PATCH v3 25/25] tcg: Check for overflow via highwater mark
  2015-09-22 20:24 [Qemu-devel] [PATCH v3 00/25] Do away with TB retranslation Richard Henderson
                   ` (23 preceding siblings ...)
  2015-09-22 20:25 ` [Qemu-devel] [PATCH v3 24/25] tcg: Allocate a guard page after code_gen_buffer Richard Henderson
@ 2015-09-22 20:25 ` Richard Henderson
  2015-09-23 19:42   ` Peter Maydell
  24 siblings, 1 reply; 53+ messages in thread
From: Richard Henderson @ 2015-09-22 20:25 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell, alex.bennee, aurelien

We currently pre-compute a worst-case code size for any TB, which
works out to be 122kB.  Since the average TB size is near 1kB, this
wastes quite a lot of storage.

Instead, check for overflow in between generating code for each opcode.
The overhead of the check isn't measurable and wastage is minimized.
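
The scheme can be sketched on its own as below.  This is an illustrative
model under assumptions, not the QEMU implementation: Emitter,
emit_block and emit_with_flush are hypothetical names, each "op" is just
a run of filler bytes, and SLACK mirrors the 1024-byte margin used in
this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { SLACK = 1024 };   /* margin kept free above the high-water mark */

typedef struct {
    uint8_t *start;      /* beginning of the code buffer */
    uint8_t *ptr;        /* next byte to be written */
    uint8_t *highwater;  /* crossing this means "flush and retry" */
} Emitter;

static void emitter_init(Emitter *e, uint8_t *mem, size_t size)
{
    assert(size > SLACK);
    e->start = e->ptr = mem;
    e->highwater = mem + (size - SLACK);
}

/* Emit n_ops ops of op_len bytes each.  The check runs after each op:
   an op that starts at or below the mark ends within the buffer as long
   as op_len <= SLACK.  Returns the bytes generated, or -1 on overflow. */
static long emit_block(Emitter *e, size_t n_ops, size_t op_len)
{
    uint8_t *begin = e->ptr;
    size_t i;

    assert(op_len <= SLACK);
    for (i = 0; i < n_ops; i++) {
        memset(e->ptr, 0xcc, op_len);
        e->ptr += op_len;
        if (e->ptr > e->highwater) {
            e->ptr = begin;          /* discard the partial block */
            return -1;
        }
    }
    return (long)(e->ptr - begin);
}

/* Caller side: on overflow, throw away all blocks and retry once. */
static long emit_with_flush(Emitter *e, size_t n_ops, size_t op_len,
                            int *flushed)
{
    long n = emit_block(e, n_ops, op_len);

    *flushed = 0;
    if (n < 0) {
        e->ptr = e->start;           /* stand-in for tb_flush() */
        *flushed = 1;
        n = emit_block(e, n_ops, op_len);
    }
    return n;
}
```

Only SLACK bytes at the end of the buffer go unused, instead of a
worst-case TB's worth before every allocation.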

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 include/exec/exec-all.h |  6 ------
 tcg/tcg.c               | 16 ++++++++++++----
 tcg/tcg.h               |  6 ++++--
 translate-all.c         | 31 ++++++++++++++++++++++++++-----
 4 files changed, 42 insertions(+), 17 deletions(-)

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 6871e78..71c9d85 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -62,12 +62,6 @@ typedef struct TranslationBlock TranslationBlock;
 #define OPC_BUF_SIZE 640
 #define OPC_MAX_SIZE (OPC_BUF_SIZE - MAX_OP_PER_INSTR)
 
-/* Maximum size a TCG op can expand to.  This is complicated because a
-   single op may require several host instructions and register reloads.
-   For now take a wild guess at 192 bytes, which should allow at least
-   a couple of fixup instructions per argument.  */
-#define TCG_MAX_OP_SIZE 192
-
 #define OPPARAM_BUF_SIZE (OPC_BUF_SIZE * MAX_OPC_PARAM)
 
 #include "qemu/log.h"
diff --git a/tcg/tcg.c b/tcg/tcg.c
index db4032a..750b977 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -375,11 +375,12 @@ void tcg_prologue_init(TCGContext *s)
     /* Deduct the prologue from the buffer.  */
     prologue_size = tcg_current_code_size(s);
     s->code_gen_ptr = s->code_gen_buffer = s->code_buf = s->code_ptr;
-
-    /* Compute a high-water mark, at which we voluntarily flush the
-       buffer and start over.  */
     total_size = s->code_gen_buffer_size -= prologue_size;
-    s->code_gen_buffer_max_size = total_size - TCG_MAX_OP_SIZE * OPC_BUF_SIZE;
+
+    /* Compute a high-water mark, at which we voluntarily flush the buffer
+       and start over.  The size here is arbitrary, significantly larger
+       than we expect the code generation for any one opcode to require.  */
+    s->code_gen_highwater = s->code_gen_buffer + (total_size - 1024);
 
     tcg_register_jit(s->code_gen_buffer, total_size);
 
@@ -2430,6 +2431,13 @@ int tcg_gen_code(TCGContext *s, tcg_insn_unit *gen_code_buf)
 #ifndef NDEBUG
         check_regs(s);
 #endif
+        /* Test for (pending) buffer overflow.  The assumption is that any
+           one operation beginning below the high water mark cannot overrun
+           the buffer completely.  Thus we can test for overflow after
+           generating code without having to check during generation.  */
+        if (unlikely((void *)s->code_ptr > s->code_gen_highwater)) {
+            return -1;
+        }
     }
     tcg_debug_assert(num_insns >= 0);
     s->gen_insn_end_off[num_insns] = tcg_current_code_size(s);
diff --git a/tcg/tcg.h b/tcg/tcg.h
index 5fbbd15..be95b98 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -559,10 +559,12 @@ struct TCGContext {
     void *code_gen_prologue;
     void *code_gen_buffer;
     size_t code_gen_buffer_size;
-    /* threshold to flush the translated code buffer */
-    size_t code_gen_buffer_max_size;
     void *code_gen_ptr;
 
+    /* Threshold to flush the translated code buffer, and where to go
+       upon overflow.  */
+    void *code_gen_highwater;
+
     TBContext tb_ctx;
 
     /* The TCGBackendData structure is private to tcg-target.c.  */
diff --git a/translate-all.c b/translate-all.c
index 0049927..5ad0a61 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -222,6 +222,7 @@ static target_long decode_sleb128(uint8_t **pp)
 
 static int encode_search(TranslationBlock *tb, uint8_t *block)
 {
+    uint8_t *highwater = tcg_ctx.code_gen_highwater;
     uint8_t *p = block;
     int i, j, n;
 
@@ -240,6 +241,14 @@ static int encode_search(TranslationBlock *tb, uint8_t *block)
         }
         prev = (i == 0 ? 0 : tcg_ctx.gen_insn_end_off[i - 1]);
         p = encode_sleb128(p, tcg_ctx.gen_insn_end_off[i] - prev);
+
+        /* Test for (pending) buffer overflow.  The assumption is that any
+           one row beginning below the high water mark cannot overrun
+           the buffer completely.  Thus we can test for overfow after
+           encoding a row without having to check during encoding.  */
+        if (unlikely(p > highwater)) {
+            return -1;
+        }
     }
 
     return p - block;
@@ -756,9 +765,7 @@ static TranslationBlock *tb_alloc(target_ulong pc)
 {
     TranslationBlock *tb;
 
-    if (tcg_ctx.tb_ctx.nb_tbs >= tcg_ctx.code_gen_max_blocks ||
-        (tcg_ctx.code_gen_ptr - tcg_ctx.code_gen_buffer) >=
-         tcg_ctx.code_gen_buffer_max_size) {
+    if (tcg_ctx.tb_ctx.nb_tbs >= tcg_ctx.code_gen_max_blocks) {
         return NULL;
     }
     tb = &tcg_ctx.tb_ctx.tbs[tcg_ctx.tb_ctx.nb_tbs++];
@@ -1063,12 +1070,15 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
     if (use_icount) {
         cflags |= CF_USE_ICOUNT;
     }
+
     tb = tb_alloc(pc);
-    if (!tb) {
+    if (unlikely(!tb)) {
+ buffer_overflow:
         /* flush must be done */
         tb_flush(cpu);
         /* cannot fail at this point */
         tb = tb_alloc(pc);
+        assert(tb != NULL);
         /* Don't forget to invalidate previous TB info.  */
         tcg_ctx.tb_ctx.tb_invalidated_flag = 1;
     }
@@ -1109,8 +1119,19 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
     tcg_ctx.code_time -= profile_getclock();
 #endif
 
+    /* ??? Overflow could be handled better here.  In particular, we
+       don't need to re-do gen_intermediate_code, nor should we re-do
+       the tcg optimization currently hidden inside tcg_gen_code.  All
+       that should be required is to flush the TBs, allocate a new TB,
+       re-initialize it per above, and re-do the actual code generation.  */
     gen_code_size = tcg_gen_code(&tcg_ctx, gen_code_buf);
+    if (unlikely(gen_code_size < 0)) {
+        goto buffer_overflow;
+    }
     search_size = encode_search(tb, (void *)gen_code_buf + gen_code_size);
+    if (unlikely(search_size < 0)) {
+        goto buffer_overflow;
+    }
 
 #ifdef CONFIG_PROFILER
     tcg_ctx.code_time += profile_getclock();
@@ -1681,7 +1702,7 @@ void dump_exec_info(FILE *f, fprintf_function cpu_fprintf)
     cpu_fprintf(f, "Translation buffer state:\n");
     cpu_fprintf(f, "gen code size       %td/%zd\n",
                 tcg_ctx.code_gen_ptr - tcg_ctx.code_gen_buffer,
-                tcg_ctx.code_gen_buffer_max_size);
+                tcg_ctx.code_gen_highwater - tcg_ctx.code_gen_buffer);
     cpu_fprintf(f, "TB count            %d/%d\n",
             tcg_ctx.tb_ctx.nb_tbs, tcg_ctx.code_gen_max_blocks);
     cpu_fprintf(f, "TB avg target size  %d max=%d bytes\n",
-- 
2.4.3
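
The high-water scheme in this patch can be sketched in isolation. What follows is a minimal illustration, not the QEMU code: CodeBuf, SLACK, emit_op, gen_code and gen_code_retry are hypothetical names standing in for TCGContext, the 1024-byte margin, per-opcode code generation, tcg_gen_code and the "goto buffer_overflow" path in tb_gen_code. It assumes, as the patch comment does, that no single emission is larger than the margin.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for TCGContext; only the fields the check needs. */
typedef struct {
    uint8_t *buf;        /* start of the code buffer */
    uint8_t *ptr;        /* next free byte */
    uint8_t *highwater;  /* buf + size - SLACK */
} CodeBuf;

/* Assumed upper bound on the size of any single emission. */
enum { SLACK = 1024 };

static void codebuf_init(CodeBuf *cb, uint8_t *mem, size_t size)
{
    assert(size >= SLACK);
    cb->buf = cb->ptr = mem;
    cb->highwater = mem + (size - SLACK);
}

/* Emit one "op" of LEN bytes without any bounds check during emission. */
static void emit_op(CodeBuf *cb, size_t len)
{
    assert(len <= SLACK);
    memset(cb->ptr, 0x90, len);
    cb->ptr += len;
}

/* Emit N ops of LEN bytes each.  Return the number of bytes generated,
   or -1 once the high-water mark has been crossed.  Because every op
   starts at or below the mark and is at most SLACK bytes long, the
   write itself never leaves the buffer.  */
static int gen_code(CodeBuf *cb, size_t n, size_t len)
{
    uint8_t *start = cb->ptr;

    for (size_t i = 0; i < n; i++) {
        emit_op(cb, len);
        if (cb->ptr > cb->highwater) {
            return -1;
        }
    }
    return (int)(cb->ptr - start);
}

/* On overflow, discard everything (the analogue of tb_flush) and
   generate again from the start of the buffer.  */
static int gen_code_retry(CodeBuf *cb, size_t n, size_t len)
{
    int r = gen_code(cb, n, len);

    if (r < 0) {
        cb->ptr = cb->buf;
        r = gen_code(cb, n, len);
    }
    return r;
}
```

Note that after gen_code returns -1 the pointer sits above the mark, so the caller has to flush before emitting anything else; that is what the retry wrapper does.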

* Re: [Qemu-devel] [PATCH v3 12/25] target-sparc: Tidy gen_branch_a interface
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 12/25] target-sparc: Tidy gen_branch_a interface Richard Henderson
@ 2015-09-22 21:23   ` Aurelien Jarno
  2015-09-24 19:42   ` Aurelien Jarno
  1 sibling, 0 replies; 53+ messages in thread
From: Aurelien Jarno @ 2015-09-22 21:23 UTC (permalink / raw)
  To: Richard Henderson; +Cc: peter.maydell, alex.bennee, qemu-devel

On 2015-09-22 13:24, Richard Henderson wrote:
> We always pass pc2 == dc->npc and r_cond == cpu_cond,
> and always set is_br afterward.  Infer all of that.
> 
> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  target-sparc/translate.c | 21 ++++++++++-----------
>  1 file changed, 10 insertions(+), 11 deletions(-)

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net

* Re: [Qemu-devel] [PATCH v3 05/25] tcg: Allow extra data to be attached to insn_start
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 05/25] tcg: Allow extra data to be attached to insn_start Richard Henderson
@ 2015-09-23 14:55   ` Kevin O'Connor
  2015-09-23 16:37     ` Richard Henderson
  2015-09-23 16:38     ` Richard Henderson
  0 siblings, 2 replies; 53+ messages in thread
From: Kevin O'Connor @ 2015-09-23 14:55 UTC (permalink / raw)
  To: Richard Henderson; +Cc: peter.maydell, alex.bennee, qemu-devel, aurelien

On Tue, Sep 22, 2015 at 01:24:47PM -0700, Richard Henderson wrote:
> With an eye toward having this data replace the gen_opc_* arrays
> that each target collects in order to enable restore_state_from_tb.

Hi Richard,

Instead of having each architecture front-end determine the constants
to be restored on an exception, have you considered having the tcg
liveness pass automatically detect them?

What I was thinking was if:
- each front-end stored each constant on every instruction using
  regular "movi" ops
- tcg_liveness_analysis() tracked which global memory "sync" writes
  are purely due to an op that can raise an exception
- then tcg_liveness_analysis() could remove "movi" instructions with
  outputs that are needed only during an exception and place the
  constant directly in the compressed table itself.

I'm curious if this was considered and if there is a reason it
wouldn't work well.

-Kevin

* Re: [Qemu-devel] [PATCH v3 05/25] tcg: Allow extra data to be attached to insn_start
  2015-09-23 14:55   ` Kevin O'Connor
@ 2015-09-23 16:37     ` Richard Henderson
  2015-09-23 16:38     ` Richard Henderson
  1 sibling, 0 replies; 53+ messages in thread
From: Richard Henderson @ 2015-09-23 16:37 UTC (permalink / raw)
  To: Kevin O'Connor; +Cc: peter.maydell, alex.bennee, qemu-devel, aurelien

On 09/23/2015 07:55 AM, Kevin O'Connor wrote:
> On Tue, Sep 22, 2015 at 01:24:47PM -0700, Richard Henderson wrote:
>> With an eye toward having this data replace the gen_opc_* arrays
>> that each target collects in order to enable restore_state_from_tb.
> 
> Hi Richard,
> 
> Instead of having each architecture front-end determine the constants
> to be restored on an exception, have you considered having the tcg
> liveness pass automatically detect them?
> 
> What I was thinking was if:
> - each front-end stored each constant on every instruction using
>   regular "movi" ops
> - tcg_liveness_analysis() tracked which global memory "sync" writes
>   are purely due to an op that can raise an exception
> - then tcg_liveness_analysis() could remove "movi" instructions with
>   outputs that are needed only during an exception and place the
>   constant directly in the compressed table itself.
> 
> I'm curious if this was considered and if there is a reason it
> wouldn't work well.

We certainly don't have enough information to infer something like that.

The moment we reach a helper that isn't marked as TCG_CALL_NO_WG, all that
inference has to go out the window and we have to assume that the "movi op" is
both necessary and overwritten.

The knowledge of which helpers modify a field such as cc_op is present in the
translators in code form.  It would require significant effort to change that.


r~

* Re: [Qemu-devel] [PATCH v3 05/25] tcg: Allow extra data to be attached to insn_start
  2015-09-23 14:55   ` Kevin O'Connor
  2015-09-23 16:37     ` Richard Henderson
@ 2015-09-23 16:38     ` Richard Henderson
  1 sibling, 0 replies; 53+ messages in thread
From: Richard Henderson @ 2015-09-23 16:38 UTC (permalink / raw)
  To: Kevin O'Connor; +Cc: peter.maydell, alex.bennee, qemu-devel, aurelien

On 09/23/2015 07:55 AM, Kevin O'Connor wrote:
> On Tue, Sep 22, 2015 at 01:24:47PM -0700, Richard Henderson wrote:
>> With an eye toward having this data replace the gen_opc_* arrays
>> that each target collects in order to enable restore_state_from_tb.
> 
> Hi Richard,
> 
> Instead of having each architecture front-end determine the constants
> to be restored on an exception, have you considered having the tcg
> liveness pass automatically detect them?
> 
> What I was thinking was if:
> - each front-end stored each constant on every instruction using
>   regular "movi" ops
> - tcg_liveness_analysis() tracked which global memory "sync" writes
>   are purely due to an op that can raise an exception
> - then tcg_liveness_analysis() could remove "movi" instructions with
>   outputs that are needed only during an exception and place the
>   constant directly in the compressed table itself.
> 
> I'm curious if this was considered and if there is a reason it
> wouldn't work well.

We certainly don't have enough information to infer something like that.

The moment we reach a helper that isn't marked as TCG_CALL_NO_WG, all that
inference has to go out the window and we have to assume that the "movi op" is
both necessary and overwritten.

The knowledge of which helpers modify a field such as cc_op is baked into the
translators at a different level.


r~

* Re: [Qemu-devel] [PATCH v3 04/25] target-*: Introduce and use cpu_breakpoint_test
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 04/25] target-*: Introduce and use cpu_breakpoint_test Richard Henderson
@ 2015-09-23 19:19   ` Peter Maydell
  0 siblings, 0 replies; 53+ messages in thread
From: Peter Maydell @ 2015-09-23 19:19 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Alex Bennée, QEMU Developers, Aurelien Jarno

On 22 September 2015 at 13:24, Richard Henderson <rth@twiddle.net> wrote:
> Reduce the boilerplate required for each target.  At the same time,
> move the test for breakpoint after calling tcg_gen_insn_start.
>
> Note that arm and aarch64 do not use cpu_breakpoint_test, but still
> move the inline test down after tcg_gen_insn_start.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM

* Re: [Qemu-devel] [PATCH v3 20/25] tcg: Save insn data and use it in cpu_restore_state_from_tb
  2015-09-22 20:25 ` [Qemu-devel] [PATCH v3 20/25] tcg: Save insn data and use it in cpu_restore_state_from_tb Richard Henderson
@ 2015-09-23 19:20   ` Peter Maydell
  2015-09-25 21:10   ` Aurelien Jarno
  1 sibling, 0 replies; 53+ messages in thread
From: Peter Maydell @ 2015-09-23 19:20 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Alex Bennée, QEMU Developers, Aurelien Jarno

On 22 September 2015 at 13:25, Richard Henderson <rth@twiddle.net> wrote:
> We can now restore state without retranslation.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  include/exec/exec-all.h |   1 +
>  tcg/tcg.c               |  40 ++++++++-----
>  tcg/tcg.h               |   4 +-
>  translate-all.c         | 149 +++++++++++++++++++++++++++++++++++-------------
>  4 files changed, 139 insertions(+), 55 deletions(-)

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM

* Re: [Qemu-devel] [PATCH v3 23/25] tcg: Emit prologue to the beginning of code_gen_buffer
  2015-09-22 20:25 ` [Qemu-devel] [PATCH v3 23/25] tcg: Emit prologue to the beginning of code_gen_buffer Richard Henderson
@ 2015-09-23 19:28   ` Peter Maydell
  2015-09-23 19:39     ` Richard Henderson
  0 siblings, 1 reply; 53+ messages in thread
From: Peter Maydell @ 2015-09-23 19:28 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Alex Bennée, QEMU Developers, Aurelien Jarno

On 22 September 2015 at 13:25, Richard Henderson <rth@twiddle.net> wrote:
> By putting the prologue at the end, we risk overwriting the
> prologue should our estimate of maximum TB size.  Given the
> two different placements of the call to tcg_prologue_init,
> move the high water mark computation into tcg_prologue_init.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  tcg/tcg.c       | 25 +++++++++++++++++++------
>  translate-all.c | 29 ++++++++++-------------------
>  2 files changed, 29 insertions(+), 25 deletions(-)
>
> diff --git a/tcg/tcg.c b/tcg/tcg.c
> index 8126af9..db4032a 100644
> --- a/tcg/tcg.c
> +++ b/tcg/tcg.c
> @@ -363,17 +363,30 @@ void tcg_context_init(TCGContext *s)
>
>  void tcg_prologue_init(TCGContext *s)
>  {
> -    /* init global prologue and epilogue */
> -    s->code_buf = s->code_gen_prologue;
> -    s->code_ptr = s->code_buf;
> +    size_t prologue_size, total_size;
> +
> +    /* Put the prologue at the beginning of code_gen_buffer.  */
> +    s->code_ptr = s->code_buf = s->code_gen_prologue = s->code_gen_buffer;
> +
> +    /* Generate the prologue.  */
>      tcg_target_qemu_prologue(s);
>      flush_icache_range((uintptr_t)s->code_buf, (uintptr_t)s->code_ptr);
>
> +    /* Deduct the prologue from the buffer.  */
> +    prologue_size = tcg_current_code_size(s);
> +    s->code_gen_ptr = s->code_gen_buffer = s->code_buf = s->code_ptr;
> +
> +    /* Compute a high-water mark, at which we voluntarily flush the
> +       buffer and start over.  */
> +    total_size = s->code_gen_buffer_size -= prologue_size;

-= on the RHS of an assignment seems unnecessarily tricky to me;
I think splitting this into two lines would be clearer.
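
For illustration, the two forms side by side. This is only a sketch: Ctx is a hypothetical cut-down context, and only the field name code_gen_buffer_size is taken from the patch. The added assert is an assumption of the sketch, not something the patch does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduced context; the field name follows the patch. */
typedef struct {
    size_t code_gen_buffer_size;
} Ctx;

/* Form used in the patch: a compound assignment nested inside
   another assignment.  */
static size_t deduct_nested(Ctx *s, size_t prologue_size)
{
    size_t total_size;

    total_size = s->code_gen_buffer_size -= prologue_size;
    return total_size;
}

/* Suggested form: one side effect per statement. */
static size_t deduct_split(Ctx *s, size_t prologue_size)
{
    size_t total_size;

    /* Guard against unsigned wrap-around (sketch-only check). */
    assert(prologue_size <= s->code_gen_buffer_size);

    s->code_gen_buffer_size -= prologue_size;
    total_size = s->code_gen_buffer_size;
    return total_size;
}
```

Both compute the same value; the split form just makes the update of the field and the read of the result separate steps.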

> +    s->code_gen_buffer_max_size = total_size - TCG_MAX_OP_SIZE * OPC_BUF_SIZE;
> +
> +    tcg_register_jit(s->code_gen_buffer, total_size);
> +
>  #ifdef DEBUG_DISAS
>      if (qemu_loglevel_mask(CPU_LOG_TB_OUT_ASM)) {
> -        size_t size = tcg_current_code_size(s);
> -        qemu_log("PROLOGUE: [size=%zu]\n", size);
> -        log_disas(s->code_buf, size);
> +        qemu_log("PROLOGUE: [size=%zu]\n", prologue_size);
> +        log_disas(s->code_gen_prologue, prologue_size);
>          qemu_log("\n");
>          qemu_log_flush();
>      }
> diff --git a/translate-all.c b/translate-all.c
> index f6b8148..4c994bb 100644
> --- a/translate-all.c
> +++ b/translate-all.c
> @@ -689,23 +689,16 @@ static inline void code_gen_alloc(size_t tb_size)
>      }
>
>      qemu_madvise(tcg_ctx.code_gen_buffer, tcg_ctx.code_gen_buffer_size,
> -            QEMU_MADV_HUGEPAGE);
> -
> -    /* Steal room for the prologue at the end of the buffer.  This ensures
> -       (via the MAX_CODE_GEN_BUFFER_SIZE limits above) that direct branches
> -       from TB's to the prologue are going to be in range.  It also means
> -       that we don't need to mark (additional) portions of the data segment
> -       as executable.  */

I take it we don't have any annoying targets with branch-forwards
ranges larger than their branch-backwards ranges...

> -    tcg_ctx.code_gen_prologue = tcg_ctx.code_gen_buffer +
> -            tcg_ctx.code_gen_buffer_size - 1024;
> -    tcg_ctx.code_gen_buffer_size -= 1024;
> -
> -    tcg_ctx.code_gen_buffer_max_size = tcg_ctx.code_gen_buffer_size -
> -        (TCG_MAX_OP_SIZE * OPC_BUF_SIZE);
> -    tcg_ctx.code_gen_max_blocks = tcg_ctx.code_gen_buffer_size /
> -            CODE_GEN_AVG_BLOCK_SIZE;
> -    tcg_ctx.tb_ctx.tbs =
> -            g_malloc(tcg_ctx.code_gen_max_blocks * sizeof(TranslationBlock));
> +                 QEMU_MADV_HUGEPAGE);
> +
> +    /* Estimate a good size for the number of TBs we can support.  We
> +       still haven't deducted the prologue from the buffer size here,
> +       but that's minimal and won't affect the estimate much.  */
> +    tcg_ctx.code_gen_max_blocks
> +        = tcg_ctx.code_gen_buffer_size / CODE_GEN_AVG_BLOCK_SIZE;
> +    tcg_ctx.tb_ctx.tbs
> +        = g_malloc(tcg_ctx.code_gen_max_blocks * sizeof(TranslationBlock));

Prefer g_new(TranslationBlock, tcg_ctx.code_gen_max_blocks).

> +
>      qemu_mutex_init(&tcg_ctx.tb_ctx.tb_lock);
>  }
>
> @@ -716,8 +709,6 @@ void tcg_exec_init(unsigned long tb_size)
>  {
>      cpu_gen_init();
>      code_gen_alloc(tb_size);
> -    tcg_ctx.code_gen_ptr = tcg_ctx.code_gen_buffer;
> -    tcg_register_jit(tcg_ctx.code_gen_buffer, tcg_ctx.code_gen_buffer_size);
>      page_init();
>  #if defined(CONFIG_SOFTMMU)
>      /* There's no guest base to take into account, so go ahead and
> --
> 2.4.3
>

Otherwise
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM

* Re: [Qemu-devel] [PATCH v3 24/25] tcg: Allocate a guard page after code_gen_buffer
  2015-09-22 20:25 ` [Qemu-devel] [PATCH v3 24/25] tcg: Allocate a guard page after code_gen_buffer Richard Henderson
@ 2015-09-23 19:39   ` Peter Maydell
  2015-09-23 20:00     ` Richard Henderson
  0 siblings, 1 reply; 53+ messages in thread
From: Peter Maydell @ 2015-09-23 19:39 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Alex Bennée, QEMU Developers, Aurelien Jarno

On 22 September 2015 at 13:25, Richard Henderson <rth@twiddle.net> wrote:
> This will catch any overflow of the buffer.
>
> Add a native win32 alternative for alloc_code_gen_buffer;
> remove the malloc alternative.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  translate-all.c | 210 ++++++++++++++++++++++++++++++++------------------------
>  1 file changed, 119 insertions(+), 91 deletions(-)
>
> diff --git a/translate-all.c b/translate-all.c
> index 4c994bb..0049927 100644
> --- a/translate-all.c
> +++ b/translate-all.c
> @@ -311,31 +311,6 @@ bool cpu_restore_state(CPUState *cpu, uintptr_t retaddr)
>      return false;
>  }
>
> -#ifdef _WIN32
> -static __attribute__((unused)) void map_exec(void *addr, long size)
> -{
> -    DWORD old_protect;
> -    VirtualProtect(addr, size,
> -                   PAGE_EXECUTE_READWRITE, &old_protect);
> -}
> -#else
> -static __attribute__((unused)) void map_exec(void *addr, long size)
> -{
> -    unsigned long start, end, page_size;
> -
> -    page_size = getpagesize();
> -    start = (unsigned long)addr;
> -    start &= ~(page_size - 1);
> -
> -    end = (unsigned long)addr + size;
> -    end += page_size - 1;
> -    end &= ~(page_size - 1);
> -
> -    mprotect((void *)start, end - start,
> -             PROT_READ | PROT_WRITE | PROT_EXEC);
> -}
> -#endif
> -
>  void page_size_init(void)
>  {
>      /* NOTE: we can always suppose that qemu_host_page_size >=
> @@ -472,14 +447,6 @@ static inline PageDesc *page_find(tb_page_addr_t index)
>  #define USE_STATIC_CODE_GEN_BUFFER
>  #endif
>
> -/* ??? Should configure for this, not list operating systems here.  */
> -#if (defined(__linux__) \
> -    || defined(__FreeBSD__) || defined(__FreeBSD_kernel__) \
> -    || defined(__DragonFly__) || defined(__OpenBSD__) \
> -    || defined(__NetBSD__))
> -# define USE_MMAP
> -#endif
> -
>  /* Minimum size of the code gen buffer.  This number is randomly chosen,
>     but not so small that we can't have a fair number of TB's live.  */
>  #define MIN_CODE_GEN_BUFFER_SIZE     (1024u * 1024)
> @@ -567,22 +534,102 @@ static inline void *split_cross_256mb(void *buf1, size_t size1)
>  static uint8_t static_code_gen_buffer[DEFAULT_CODE_GEN_BUFFER_SIZE]
>      __attribute__((aligned(CODE_GEN_ALIGN)));
>
> +# ifdef _WIN32

Why the space before ifdef here?

> +static inline void do_protect(void *addr, long size, int prot)
> +{
> +    DWORD old_protect;
> +    VirtualProtect(addr, size, PAGE_EXECUTE_READWRITE, &old_protect);

The 'prot' argument isn't used -- did you mean to pass it
in as VirtualProtect argument 3?

> +}
> +
> +static inline void map_exec(void *addr, long size)
> +{
> +    do_protect(addr, size, PAGE_EXECUTE_READWRITE);
> +}
> +
> +static inline void map_none(void *addr, long size)
> +{
> +    do_protect(addr, size, PAGE_NOACCESS);
> +}
> +# else
> +static inline void do_protect(void *addr, long size, int prot)
> +{
> +    uintptr_t start, end;
> +
> +    start = (uintptr_t)addr;
> +    start &= qemu_real_host_page_mask;
> +
> +    end = (uintptr_t)addr + size;
> +    end = ROUND_UP(end, qemu_real_host_page_size);
> +
> +    mprotect((void *)start, end - start, prot);
> +}
> +
> +static inline void map_exec(void *addr, long size)
> +{
> +    do_protect(addr, size, PROT_READ | PROT_WRITE | PROT_EXEC);
> +}
> +
> +static inline void map_none(void *addr, long size)
> +{
> +    do_protect(addr, size, PROT_NONE);
> +}
> +# endif /* WIN32 */
> +
>  static inline void *alloc_code_gen_buffer(void)
>  {
>      void *buf = static_code_gen_buffer;
> +    size_t full_size, size;
> +
> +    /* The size of the buffer, rounded down to end on a page boundary.  */
> +    full_size = (((uintptr_t)buf + sizeof(static_code_gen_buffer))
> +                 & qemu_real_host_page_mask) - (uintptr_t)buf;
> +
> +    /* Reserve a guard page.  */
> +    size = full_size - qemu_real_host_page_size;
> +
> +    /* Honor a command-line option limiting the size of the buffer.  */
> +    if (size > tcg_ctx.code_gen_buffer_size) {
> +        size = (((uintptr_t)buf + tcg_ctx.code_gen_buffer_size)
> +                & qemu_real_host_page_mask) - (uintptr_t)buf;
> +    }
> +    tcg_ctx.code_gen_buffer_size = size;
> +
>  #ifdef __mips__
> -    if (cross_256mb(buf, tcg_ctx.code_gen_buffer_size)) {
> -        buf = split_cross_256mb(buf, tcg_ctx.code_gen_buffer_size);
> +    if (cross_256mb(buf, size)) {
> +        buf = split_cross_256mb(buf, size);
> +        size = tcg_ctx.code_gen_buffer_size;
>      }
>  #endif
> -    map_exec(buf, tcg_ctx.code_gen_buffer_size);
> +
> +    map_exec(buf, size);
> +    map_none(buf + size, qemu_real_host_page_size);
> +    qemu_madvise(buf, size, QEMU_MADV_HUGEPAGE);

I think we're now doing the MADV_HUGEPAGE over "buffer size
minus a page" rather than "buffer size". Does that mean
we've gone from doing the madvise on a whole number of
hugepages to doing it on something that's not a whole number
of hugepages, and if so does the kernel decide not to use
hugepages here?

(i.e., should we make the buffer size we allocate size + a
guard page, rather than taking the guard page out of the size?)
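
A minimal sketch of the "size + a guard page" alternative, assuming a POSIX host. alloc_with_guard is a hypothetical helper, not a function from the patch; it maps the region read/write only (no PROT_EXEC) so that it also runs on hosts that refuse writable-and-executable mappings, and it makes no attempt to align the buffer to a hugepage boundary, so whether the kernel actually uses hugepages is not something this sketch settles.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS and madvise on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: reserve SIZE usable bytes followed by one
   inaccessible guard page.  SIZE must be a multiple of the page size.
   Returns NULL on failure.  */
static void *alloc_with_guard(size_t size)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    void *buf;

    assert(size % page == 0);

    /* Reserve the whole range, guard page included, with no access. */
    buf = mmap(NULL, size + page, PROT_NONE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED) {
        return NULL;
    }

    /* Open up the usable part; the trailing page stays PROT_NONE. */
    if (mprotect(buf, size, PROT_READ | PROT_WRITE) != 0) {
        munmap(buf, size + page);
        return NULL;
    }

#ifdef MADV_HUGEPAGE
    /* Advisory only; the hint covers the full requested size and a
       failure is ignored.  */
    (void)madvise(buf, size, MADV_HUGEPAGE);
#endif

    return buf;
}
```

With this layout the madvise range is the size the caller asked for, and the guard page is extra address space rather than a page carved out of the buffer.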


> +
>      return buf;
>  }
> -#elif defined(USE_MMAP)
> +#elif defined(_WIN32)
> +static inline void *alloc_code_gen_buffer(void)
> +{
> +    size_t size = tcg_ctx.code_gen_buffer_size;
> +    void *buf1, *buf2;
> +
> +    /* Perform the allocation in two steps, so that the guard page
> +       is reserved but uncommitted.  */
> +    buf1 = VirtualAlloc(NULL, size + qemu_real_host_page_size,
> +                        MEM_RESERVE, PAGE_NOACCESS);
> +    if (buf1 != NULL) {
> +        buf2 = VirtualAlloc(buf1, size, MEM_COMMIT, PAGE_EXECUTE_READWRITE);
> +        assert(buf1 == buf2);
> +    }
> +
> +    return buf1;
> +}
> +#else
>  static inline void *alloc_code_gen_buffer(void)
>  {
>      int flags = MAP_PRIVATE | MAP_ANONYMOUS;
>      uintptr_t start = 0;
> +    size_t size = tcg_ctx.code_gen_buffer_size;
>      void *buf;
>
>      /* Constrain the position of the buffer based on the host cpu.
> @@ -598,86 +645,70 @@ static inline void *alloc_code_gen_buffer(void)
>         Leave the choice of exact location with the kernel.  */
>      flags |= MAP_32BIT;
>      /* Cannot expect to map more than 800MB in low memory.  */
> -    if (tcg_ctx.code_gen_buffer_size > 800u * 1024 * 1024) {
> -        tcg_ctx.code_gen_buffer_size = 800u * 1024 * 1024;
> +    if (size > 800u * 1024 * 1024) {
> +        tcg_ctx.code_gen_buffer_size = size = 800u * 1024 * 1024;
>      }
>  # elif defined(__sparc__)
>      start = 0x40000000ul;
>  # elif defined(__s390x__)
>      start = 0x90000000ul;
>  # elif defined(__mips__)
> -    /* ??? We ought to more explicitly manage layout for softmmu too.  */
> -#  ifdef CONFIG_USER_ONLY
> -    start = 0x68000000ul;
> -#  elif _MIPS_SIM == _ABI64
> +#  if _MIPS_SIM == _ABI64
>      start = 0x128000000ul;
>  #  else
>      start = 0x08000000ul;
>  #  endif
>  # endif
>
> -    buf = mmap((void *)start, tcg_ctx.code_gen_buffer_size,
> -               PROT_WRITE | PROT_READ | PROT_EXEC, flags, -1, 0);
> +    buf = mmap((void *)start, size + qemu_real_host_page_size,
> +               PROT_NONE, flags, -1, 0);
>      if (buf == MAP_FAILED) {
>          return NULL;
>      }
>
>  #ifdef __mips__
> -    if (cross_256mb(buf, tcg_ctx.code_gen_buffer_size)) {
> +    if (cross_256mb(buf, size)) {
>          /* Try again, with the original still mapped, to avoid re-acquiring
>             that 256mb crossing.  This time don't specify an address.  */
> -        size_t size2, size1 = tcg_ctx.code_gen_buffer_size;
> -        void *buf2 = mmap(NULL, size1, PROT_WRITE | PROT_READ | PROT_EXEC,
> -                          flags, -1, 0);
> -        if (buf2 != MAP_FAILED) {
> -            if (!cross_256mb(buf2, size1)) {
> +        size_t size2;
> +        void *buf2 = mmap(NULL, size + qemu_real_host_page_size,
> +                          PROT_NONE, flags, -1, 0);
> +        switch (buf2 != MAP_FAILED) {
> +        case 1:
> +            if (!cross_256mb(buf2, size)) {
>                  /* Success!  Use the new buffer.  */
> -                munmap(buf, size1);
> -                return buf2;
> +                munmap(buf, size);
> +                break;
>              }
>              /* Failure.  Work with what we had.  */
> -            munmap(buf2, size1);
> +            munmap(buf2, size);
> +            /* fallthru */
> +        default:
> +            /* Split the original buffer.  Free the smaller half.  */
> +            buf2 = split_cross_256mb(buf, size);
> +            size2 = tcg_ctx.code_gen_buffer_size;
> +            if (buf == buf2) {
> +                munmap(buf + size2 + qemu_real_host_page_size, size - size2);
> +            } else {
> +                munmap(buf, size - size2);
> +            }
> +            size = size2;
> +            break;
>          }
> -
> -        /* Split the original buffer.  Free the smaller half.  */
> -        buf2 = split_cross_256mb(buf, size1);
> -        size2 = tcg_ctx.code_gen_buffer_size;
> -        munmap(buf + (buf == buf2 ? size2 : 0), size1 - size2);
> -        return buf2;
> +        buf = buf2;
>      }
>  #endif
>
> -    return buf;
> -}
> -#else
> -static inline void *alloc_code_gen_buffer(void)
> -{
> -    void *buf = g_try_malloc(tcg_ctx.code_gen_buffer_size);
> +    /* Make the final buffer accessable.  The guard page at the end
> +       will remain inaccessable with PROT_NONE.  */

"accessible"; "inaccessible".

> +    mprotect(buf, size, PROT_WRITE | PROT_READ | PROT_EXEC);
>
> -    if (buf == NULL) {
> -        return NULL;
> -    }
> +    /* Request large pages for the buffer.  */
> +    qemu_madvise(buf, size, QEMU_MADV_HUGEPAGE);
>
> -#ifdef __mips__
> -    if (cross_256mb(buf, tcg_ctx.code_gen_buffer_size)) {
> -        void *buf2 = g_malloc(tcg_ctx.code_gen_buffer_size);
> -        if (buf2 != NULL && !cross_256mb(buf2, size1)) {
> -            /* Success!  Use the new buffer.  */
> -            free(buf);
> -            buf = buf2;
> -        } else {
> -            /* Failure.  Work with what we had.  Since this is malloc
> -               and not mmap, we can't free the other half.  */
> -            free(buf2);
> -            buf = split_cross_256mb(buf, tcg_ctx.code_gen_buffer_size);
> -        }
> -    }
> -#endif
> -
> -    map_exec(buf, tcg_ctx.code_gen_buffer_size);
>      return buf;
>  }
> -#endif /* USE_STATIC_CODE_GEN_BUFFER, USE_MMAP */
> +#endif /* USE_STATIC_CODE_GEN_BUFFER, WIN32, POSIX */
>
>  static inline void code_gen_alloc(size_t tb_size)
>  {
> @@ -688,9 +719,6 @@ static inline void code_gen_alloc(size_t tb_size)
>          exit(1);
>      }
>
> -    qemu_madvise(tcg_ctx.code_gen_buffer, tcg_ctx.code_gen_buffer_size,
> -                 QEMU_MADV_HUGEPAGE);
> -
>      /* Estimate a good size for the number of TBs we can support.  We
>         still haven't deducted the prologue from the buffer size here,
>         but that's minimal and won't affect the estimate much.  */
> @@ -708,8 +736,8 @@ static inline void code_gen_alloc(size_t tb_size)
>  void tcg_exec_init(unsigned long tb_size)
>  {
>      cpu_gen_init();
> -    code_gen_alloc(tb_size);
>      page_init();
> +    code_gen_alloc(tb_size);
>  #if defined(CONFIG_SOFTMMU)
>      /* There's no guest base to take into account, so go ahead and
>         initialize the prologue now.  */
> --
> 2.4.3
>

thanks
-- PMM

* Re: [Qemu-devel] [PATCH v3 23/25] tcg: Emit prologue to the beginning of code_gen_buffer
  2015-09-23 19:28   ` Peter Maydell
@ 2015-09-23 19:39     ` Richard Henderson
  0 siblings, 0 replies; 53+ messages in thread
From: Richard Henderson @ 2015-09-23 19:39 UTC (permalink / raw)
  To: Peter Maydell; +Cc: Alex Bennée, QEMU Developers, Aurelien Jarno

On 09/23/2015 12:28 PM, Peter Maydell wrote:
>> -    /* Steal room for the prologue at the end of the buffer.  This ensures
>> -       (via the MAX_CODE_GEN_BUFFER_SIZE limits above) that direct branches
>> -       from TB's to the prologue are going to be in range.  It also means
>> -       that we don't need to mark (additional) portions of the data segment
>> -       as executable.  */
> 
> I take it we don't have any annoying targets with branch-forwards
> ranges larger than their branch-backwards ranges...

No, thankfully.


r~

* Re: [Qemu-devel] [PATCH v3 25/25] tcg: Check for overflow via highwater mark
  2015-09-22 20:25 ` [Qemu-devel] [PATCH v3 25/25] tcg: Check for overflow via highwater mark Richard Henderson
@ 2015-09-23 19:42   ` Peter Maydell
  2015-09-23 20:01     ` Richard Henderson
  0 siblings, 1 reply; 53+ messages in thread
From: Peter Maydell @ 2015-09-23 19:42 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Alex Bennée, QEMU Developers, Aurelien Jarno

On 22 September 2015 at 13:25, Richard Henderson <rth@twiddle.net> wrote:
> We currently pre-compute an worst case code size for any TB, which
> works out to be 122kB.  Since the average TB size is near 1kB, this
> wastes quite a lot of storage.
>
> Instead, check for overflow in between generating code for each opcode.
> The overhead of the check isn't measurable and wastage is minimized.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  include/exec/exec-all.h |  6 ------
>  tcg/tcg.c               | 16 ++++++++++++----
>  tcg/tcg.h               |  6 ++++--
>  translate-all.c         | 31 ++++++++++++++++++++++++++-----
>  4 files changed, 42 insertions(+), 17 deletions(-)
>
> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
> index 6871e78..71c9d85 100644
> --- a/include/exec/exec-all.h
> +++ b/include/exec/exec-all.h
> @@ -62,12 +62,6 @@ typedef struct TranslationBlock TranslationBlock;
>  #define OPC_BUF_SIZE 640
>  #define OPC_MAX_SIZE (OPC_BUF_SIZE - MAX_OP_PER_INSTR)
>
> -/* Maximum size a TCG op can expand to.  This is complicated because a
> -   single op may require several host instructions and register reloads.
> -   For now take a wild guess at 192 bytes, which should allow at least
> -   a couple of fixup instructions per argument.  */
> -#define TCG_MAX_OP_SIZE 192
> -
>  #define OPPARAM_BUF_SIZE (OPC_BUF_SIZE * MAX_OPC_PARAM)
>
>  #include "qemu/log.h"
> diff --git a/tcg/tcg.c b/tcg/tcg.c
> index db4032a..750b977 100644
> --- a/tcg/tcg.c
> +++ b/tcg/tcg.c
> @@ -375,11 +375,12 @@ void tcg_prologue_init(TCGContext *s)
>      /* Deduct the prologue from the buffer.  */
>      prologue_size = tcg_current_code_size(s);
>      s->code_gen_ptr = s->code_gen_buffer = s->code_buf = s->code_ptr;
> -
> -    /* Compute a high-water mark, at which we voluntarily flush the
> -       buffer and start over.  */
>      total_size = s->code_gen_buffer_size -= prologue_size;
> -    s->code_gen_buffer_max_size = total_size - TCG_MAX_OP_SIZE * OPC_BUF_SIZE;
> +
> +    /* Compute a high-water mark, at which we voluntarily flush the buffer
> +       and start over.  The size here is arbitrary, significantly larger
> +       than we expect the code generation for any one opcode to require.  */
> +    s->code_gen_highwater = s->code_gen_buffer + (total_size - 1024);
>
>      tcg_register_jit(s->code_gen_buffer, total_size);
>
> @@ -2430,6 +2431,13 @@ int tcg_gen_code(TCGContext *s, tcg_insn_unit *gen_code_buf)
>  #ifndef NDEBUG
>          check_regs(s);
>  #endif
> +        /* Test for (pending) buffer overflow.  The assumption is that any
> +           one operation beginning below the high water mark cannot overrun
> +           the buffer completely.  Thus we can test for overfow after
> +           generating code without having to check during generation.  */

"overflow"


> +        if (unlikely(s->code_gen_ptr > s->code_gen_highwater)) {
> +            return -1;
> +        }
>      }
>      tcg_debug_assert(num_insns >= 0);
>      s->gen_insn_end_off[num_insns] = tcg_current_code_size(s);
> diff --git a/tcg/tcg.h b/tcg/tcg.h
> index 5fbbd15..be95b98 100644
> --- a/tcg/tcg.h
> +++ b/tcg/tcg.h
> @@ -559,10 +559,12 @@ struct TCGContext {
>      void *code_gen_prologue;
>      void *code_gen_buffer;
>      size_t code_gen_buffer_size;
> -    /* threshold to flush the translated code buffer */
> -    size_t code_gen_buffer_max_size;
>      void *code_gen_ptr;
>
> +    /* Threshold to flush the translated code buffer, and where to go
> +       upon overflow.  */
> +    void *code_gen_highwater;

I don't understand what the "and where to go upon overflow" part
of this comment means. Can you elaborate?

> +
>      TBContext tb_ctx;
>
>      /* The TCGBackendData structure is private to tcg-target.c.  */
> diff --git a/translate-all.c b/translate-all.c
> index 0049927..5ad0a61 100644
> --- a/translate-all.c
> +++ b/translate-all.c
> @@ -222,6 +222,7 @@ static target_long decode_sleb128(uint8_t **pp)
>
>  static int encode_search(TranslationBlock *tb, uint8_t *block)
>  {
> +    uint8_t *highwater = tcg_ctx.code_gen_highwater;
>      uint8_t *p = block;
>      int i, j, n;
>
> @@ -240,6 +241,14 @@ static int encode_search(TranslationBlock *tb, uint8_t *block)
>          }
>          prev = (i == 0 ? 0 : tcg_ctx.gen_insn_end_off[i - 1]);
>          p = encode_sleb128(p, tcg_ctx.gen_insn_end_off[i] - prev);
> +
> +        /* Test for (pending) buffer overflow.  The assumption is that any
> +           one row beginning below the high water mark cannot overrun
> +           the buffer completely.  Thus we can test for overfow after
> +           encoding a row without having to check during encoding.  */

"overflow"

> +        if (unlikely(p > highwater)) {
> +            return -1;
> +        }
>      }
>
>      return p - block;
> @@ -756,9 +765,7 @@ static TranslationBlock *tb_alloc(target_ulong pc)
>  {
>      TranslationBlock *tb;
>
> -    if (tcg_ctx.tb_ctx.nb_tbs >= tcg_ctx.code_gen_max_blocks ||
> -        (tcg_ctx.code_gen_ptr - tcg_ctx.code_gen_buffer) >=
> -         tcg_ctx.code_gen_buffer_max_size) {
> +    if (tcg_ctx.tb_ctx.nb_tbs >= tcg_ctx.code_gen_max_blocks) {
>          return NULL;
>      }
>      tb = &tcg_ctx.tb_ctx.tbs[tcg_ctx.tb_ctx.nb_tbs++];
> @@ -1063,12 +1070,15 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
>      if (use_icount) {
>          cflags |= CF_USE_ICOUNT;
>      }
> +
>      tb = tb_alloc(pc);
> -    if (!tb) {
> +    if (unlikely(!tb)) {
> + buffer_overflow:
>          /* flush must be done */
>          tb_flush(cpu);
>          /* cannot fail at this point */
>          tb = tb_alloc(pc);
> +        assert(tb != NULL);
>          /* Don't forget to invalidate previous TB info.  */
>          tcg_ctx.tb_ctx.tb_invalidated_flag = 1;
>      }
> @@ -1109,8 +1119,19 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
>      tcg_ctx.code_time -= profile_getclock();
>  #endif
>
> +    /* ??? Overflow could be handled better here.  In particular, we
> +       don't need to re-do gen_intermediate_code, nor should we re-do
> +       the tcg optimization currently hidden inside tcg_gen_code.  All
> +       that should be required is to flush the TBs, allocate a new TB,
> +       re-initialize it per above, and re-do the actual code generation.  */
>      gen_code_size = tcg_gen_code(&tcg_ctx, gen_code_buf);
> +    if (unlikely(gen_code_size < 0)) {
> +        goto buffer_overflow;
> +    }
>      search_size = encode_search(tb, (void *)gen_code_buf + gen_code_size);
> +    if (unlikely(search_size < 0)) {
> +        goto buffer_overflow;
> +    }
>
>  #ifdef CONFIG_PROFILER
>      tcg_ctx.code_time += profile_getclock();
> @@ -1681,7 +1702,7 @@ void dump_exec_info(FILE *f, fprintf_function cpu_fprintf)
>      cpu_fprintf(f, "Translation buffer state:\n");
>      cpu_fprintf(f, "gen code size       %td/%zd\n",
>                  tcg_ctx.code_gen_ptr - tcg_ctx.code_gen_buffer,
> -                tcg_ctx.code_gen_buffer_max_size);
> +                tcg_ctx.code_gen_highwater - tcg_ctx.code_gen_buffer);
>      cpu_fprintf(f, "TB count            %d/%d\n",
>              tcg_ctx.tb_ctx.nb_tbs, tcg_ctx.code_gen_max_blocks);
>      cpu_fprintf(f, "TB avg target size  %d max=%d bytes\n",
> --
> 2.4.3
>

Otherwise
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH v3 24/25] tcg: Allocate a guard page after code_gen_buffer
  2015-09-23 19:39   ` Peter Maydell
@ 2015-09-23 20:00     ` Richard Henderson
  2015-09-23 20:37       ` Peter Maydell
  0 siblings, 1 reply; 53+ messages in thread
From: Richard Henderson @ 2015-09-23 20:00 UTC (permalink / raw)
  To: Peter Maydell; +Cc: Alex Bennée, QEMU Developers, Aurelien Jarno

On 09/23/2015 12:39 PM, Peter Maydell wrote:
>> +# ifdef _WIN32
> 
> Why the space before ifdef here ?

#ifdef USE_STATIC_CODE_GEN_BUFFER
# ifdef _WIN32
# else
# endif /* WIN32 */
#elif defined(_WIN32)
#else
#endif

It's something that glibc requires for its coding style, and I find myself
using it most of the time.

>> +static inline void do_protect(void *addr, long size, int prot)
>> +{
>> +    DWORD old_protect;
>> +    VirtualProtect(addr, size, PAGE_EXECUTE_READWRITE, &old_protect);
> 
> The 'prot' argument isn't used -- did you mean to pass it
> in as VirtualProtect argument 3 ?

Oops, yes.

>>  static inline void *alloc_code_gen_buffer(void)
>>  {
>>      void *buf = static_code_gen_buffer;
>> +    size_t full_size, size;
>> +
>> +    /* The size of the buffer, rounded down to end on a page boundary.  */
>> +    full_size = (((uintptr_t)buf + sizeof(static_code_gen_buffer))
>> +                 & qemu_real_host_page_mask) - (uintptr_t)buf;
>> +
>> +    /* Reserve a guard page.  */
>> +    size = full_size - qemu_real_host_page_size;
>> +
>> +    /* Honor a command-line option limiting the size of the buffer.  */
>> +    if (size > tcg_ctx.code_gen_buffer_size) {
>> +        size = (((uintptr_t)buf + tcg_ctx.code_gen_buffer_size)
>> +                & qemu_real_host_page_mask) - (uintptr_t)buf;
>> +    }
>> +    tcg_ctx.code_gen_buffer_size = size;
>> +
>>  #ifdef __mips__
>> -    if (cross_256mb(buf, tcg_ctx.code_gen_buffer_size)) {
>> -        buf = split_cross_256mb(buf, tcg_ctx.code_gen_buffer_size);
>> +    if (cross_256mb(buf, size)) {
>> +        buf = split_cross_256mb(buf, size);
>> +        size = tcg_ctx.code_gen_buffer_size;
>>      }
>>  #endif
>> -    map_exec(buf, tcg_ctx.code_gen_buffer_size);
>> +
>> +    map_exec(buf, size);
>> +    map_none(buf + size, qemu_real_host_page_size);
>> +    qemu_madvise(buf, size, QEMU_MADV_HUGEPAGE);
> 
> I think we're now doing the MADV_HUGEPAGE over "buffer size
> minus a page" rather than "buffer size". Does that mean
> we've gone from doing the madvise on a whole number of
> hugepages to doing it on something that's not a whole number
> of hugepages, and if so does the kernel decide not to use
> hugepages here?

On the whole I don't think it matters.  The static buffer isn't page aligned to
begin with, much less hugepage aligned, so the fact that we're allocating a
round number like 32mb here doesn't really mean much.  The beginning and/or end
pages of the buffer definitely aren't going to be hugepage.

Worse, the same is true for the mmap path, since I've never seen the kernel
select a hugepage aligned address.  You'd think that adding MAP_HUGETLB would
be akin to MADV_HUGEPAGE, with the additional hint that the address should be
appropriately aligned for the hugepage, but no.  It implies forced use of
something from the hugepage pool and that requires extra suid capabilities.

I've wondered about over-allocating on the mmap path, so that we can choose the
hugepage aligned subregion.  But as far as I can tell, my kernel doesn't
allocate hugepages at all, no matter what we do.  So it seems a little silly to
go so far out of the way to get an aligned buffer.
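
For reference, the over-allocation idea above could look roughly like the
following.  This is only a sketch under stated assumptions, not code from the
patch: the function name and the 2 MiB huge page size are hypothetical, and a
real JIT buffer would also need PROT_EXEC.  It maps extra address space, keeps
the hugepage-aligned subregion plus one trailing guard page, and unmaps the
rest.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Assumed huge page size: 2 MiB is common on x86-64 but not universal.  */
#define ASSUMED_HPAGE_SIZE ((size_t)2 << 20)

/* Hypothetical helper: return SIZE bytes of read/write memory that start
   on a huge page boundary and are followed by one inaccessible guard
   page.  Returns NULL on failure.  */
static void *alloc_hugepage_aligned(size_t size)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t want, total;
    uint8_t *raw, *buf, *end, *raw_end;

    /* Round the usable size up to a whole number of host pages.  */
    size = (size + page - 1) & ~(page - 1);
    want = size + page;                 /* usable area plus guard page */
    total = want + ASSUMED_HPAGE_SIZE;  /* slack so we can align */

    /* Reserve everything inaccessible; the guard page simply stays so.  */
    raw = mmap(NULL, total, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (raw == MAP_FAILED) {
        return NULL;
    }

    /* Pick the first huge page boundary inside the reservation.  */
    buf = (uint8_t *)(((uintptr_t)raw + ASSUMED_HPAGE_SIZE - 1)
                      & ~(uintptr_t)(ASSUMED_HPAGE_SIZE - 1));
    end = buf + want;
    raw_end = raw + total;

    /* Give back the unused head and tail of the reservation.  */
    if (buf > raw) {
        munmap(raw, (size_t)(buf - raw));
    }
    if (raw_end > end) {
        munmap(end, (size_t)(raw_end - end));
    }

    if (mprotect(buf, size, PROT_READ | PROT_WRITE) != 0) {
        munmap(buf, want);
        return NULL;
    }
#ifdef MADV_HUGEPAGE
    /* Advisory only; the kernel is free to ignore it.  */
    (void)madvise(buf, size, MADV_HUGEPAGE);
#endif
    return buf;
}
```

Whether the alignment buys anything depends on the kernel's transparent
hugepage settings, so this would need measuring before it is worth the extra
code.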


r~

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH v3 25/25] tcg: Check for overflow via highwater mark
  2015-09-23 19:42   ` Peter Maydell
@ 2015-09-23 20:01     ` Richard Henderson
  0 siblings, 0 replies; 53+ messages in thread
From: Richard Henderson @ 2015-09-23 20:01 UTC (permalink / raw)
  To: Peter Maydell; +Cc: Alex Bennée, QEMU Developers, Aurelien Jarno

On 09/23/2015 12:42 PM, Peter Maydell wrote:
>> +    /* Threshold to flush the translated code buffer, and where to go
>> +       upon overflow.  */
>> +    void *code_gen_highwater;
> 
> I don't understand what the "and where to go upon overflow" part
> of this comment means. Can you elaborate?

Heh.  Comment written when there was a jmp_buf there too.
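
For context, the check that this field supports can be sketched in isolation
as below.  This is a toy model, not the patch itself: the ToyCodeBuffer type
and the toy_* functions are hypothetical, and "slack" stands in for the
arbitrary 1024 bytes the patch subtracts.  The point is that the test happens
after each emission, on the assumption that one operation starting at or below
the mark cannot run past the end of the buffer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the code buffer fields of TCGContext.  */
typedef struct {
    uint8_t *buf;        /* start of the buffer */
    uint8_t *ptr;        /* next byte to be written */
    uint8_t *highwater;  /* flush once ptr moves past this */
} ToyCodeBuffer;

/* SLACK must be at least the largest single emission.  */
static void toy_init(ToyCodeBuffer *cb, uint8_t *mem, size_t size,
                     size_t slack)
{
    cb->buf = cb->ptr = mem;
    cb->highwater = mem + (size - slack);
}

/* Emit LEN bytes (LEN <= slack) without any check during generation,
   then test for pending overflow.  Returns -1 if the caller should
   flush and retry, 0 otherwise.  Must only be called while ptr is at
   or below the highwater mark.  */
static int toy_emit(ToyCodeBuffer *cb, const uint8_t *bytes, size_t len)
{
    memcpy(cb->ptr, bytes, len);
    cb->ptr += len;
    return cb->ptr > cb->highwater ? -1 : 0;
}

/* Discard everything and start over, as a flush would.  */
static void toy_flush(ToyCodeBuffer *cb)
{
    cb->ptr = cb->buf;
}
```

A caller that sees -1 would call toy_flush and regenerate, which corresponds
to the "goto buffer_overflow" path in tb_gen_code.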


r~

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH v3 24/25] tcg: Allocate a guard page after code_gen_buffer
  2015-09-23 20:00     ` Richard Henderson
@ 2015-09-23 20:37       ` Peter Maydell
  2015-09-23 22:12         ` Richard Henderson
  0 siblings, 1 reply; 53+ messages in thread
From: Peter Maydell @ 2015-09-23 20:37 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Alex Bennée, QEMU Developers, Aurelien Jarno

On 23 September 2015 at 13:00, Richard Henderson <rth@twiddle.net> wrote:
> On 09/23/2015 12:39 PM, Peter Maydell wrote:
>> I think we're now doing the MADV_HUGEPAGE over "buffer size
>> minus a page" rather than "buffer size". Does that mean
>> we've gone from doing the madvise on a whole number of
>> hugepages to doing it on something that's not a whole number
>> of hugepages, and if so does the kernel decide not to use
>> hugepages here?
>
> On the whole I don't think it matters.  The static buffer isn't page aligned to
> begin with, much less hugepage aligned, so the fact that we're allocating a
> round number like 32mb here doesn't really mean much.  The beginning and/or end
> pages of the buffer definitely aren't going to be hugepage.
>
> Worse, the same is true for the mmap path, since I've never seen the kernel
> select a hugepage aligned address.  You'd think that adding MAP_HUGETLB would
> be akin to MADV_HUGEPAGE, with the additional hint that the address should be
> appropriately aligned for the hugepage, but no.  It implies forced use of
> something from the hugepage pool and that requires extra suid capabilities.
>
> I've wondered about over-allocating on the mmap path, so that we can choose the
> hugepage aligned subregion.  But as far as I can tell, my kernel doesn't
> allocate hugepages at all, no matter what we do.  So it seems a little silly to
> go so far out of the way to get an aligned buffer.

This raises the converse question of "why are we bothering with
MADV_HUGEPAGE at all?" :-)

-- PMM

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH v3 24/25] tcg: Allocate a guard page after code_gen_buffer
  2015-09-23 20:37       ` Peter Maydell
@ 2015-09-23 22:12         ` Richard Henderson
  0 siblings, 0 replies; 53+ messages in thread
From: Richard Henderson @ 2015-09-23 22:12 UTC (permalink / raw)
  To: Peter Maydell; +Cc: Alex Bennée, QEMU Developers, Aurelien Jarno

On 09/23/2015 01:37 PM, Peter Maydell wrote:
> On 23 September 2015 at 13:00, Richard Henderson <rth@twiddle.net> wrote:
>> I've wondered about over-allocating on the mmap path, so that we can choose the
>> hugepage aligned subregion.  But as far as I can tell, my kernel doesn't
>> allocate hugepages at all, no matter what we do.  So it seems a little silly to
>> go so far out of the way to get an aligned buffer.
>
> This raises the converse question of "why are we bothering with
> MADV_HUGEPAGE at all?" :-)

I beg your pardon -- I was merely looking in the wrong place for the info.
/proc/pid/smaps does show that nearly all of the area is using huge pages:

Main memory:
7fc130000000-7fc1b0000000 rw-p 00000000 00:00 0
Size:            2097152 kB
Rss:               88064 kB
Pss:               88064 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:     88064 kB
Referenced:        88064 kB
Anonymous:         88064 kB
AnonHugePages:     88064 kB
Swap:                  0 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Locked:                0 kB

code_gen_buffer:
7fc1d76e6000-7fc1f76e6000 rwxp 00000000 00:00 0
Size:             524288 kB
Rss:               58472 kB
Pss:               58472 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:     58472 kB
Referenced:        58472 kB
Anonymous:         58472 kB
AnonHugePages:     57344 kB
Swap:                  0 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Locked:                0 kB
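
The same numbers can be collected programmatically.  The following is a
minimal sketch, assuming the "Field:   N kB" line layout shown above; the
function names are hypothetical.  It sums one field, such as AnonHugePages,
over every mapping in an smaps file.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: if LINE starts with FIELD (for example
   "AnonHugePages:"), store its value in *KB and return 1.
   Otherwise return 0 and leave *KB alone.  */
static int parse_smaps_kb(const char *line, const char *field, long *kb)
{
    size_t n = strlen(field);
    long val;

    if (strncmp(line, field, n) != 0) {
        return 0;
    }
    if (sscanf(line + n, "%ld kB", &val) != 1) {
        return 0;
    }
    *kb = val;
    return 1;
}

/* Sum FIELD over all mappings listed in PATH, e.g. "/proc/self/smaps".
   Returns the total in kB, or -1 if the file cannot be opened.  */
static long sum_smaps_kb(const char *path, const char *field)
{
    char line[512];
    long total = 0, kb;
    FILE *f = fopen(path, "r");

    if (!f) {
        return -1;
    }
    while (fgets(line, sizeof(line), f)) {
        if (parse_smaps_kb(line, field, &kb)) {
            total += kb;
        }
    }
    fclose(f);
    return total;
}
```

Comparing the AnonHugePages total against the Rss total gives a rough idea of
how much of the resident memory is backed by huge pages.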


r~

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH v3 12/25] target-sparc: Tidy gen_branch_a interface
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 12/25] target-sparc: Tidy gen_branch_a interface Richard Henderson
  2015-09-22 21:23   ` Aurelien Jarno
@ 2015-09-24 19:42   ` Aurelien Jarno
  1 sibling, 0 replies; 53+ messages in thread
From: Aurelien Jarno @ 2015-09-24 19:42 UTC (permalink / raw)
  To: Richard Henderson; +Cc: peter.maydell, alex.bennee, qemu-devel

On 2015-09-22 13:24, Richard Henderson wrote:
> We always pass pc2 == dc->npc and r_cond == cpu_cond,
> and always set is_br afterward.  Infer all of that.
> 
> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  target-sparc/translate.c | 21 ++++++++++-----------
>  1 file changed, 10 insertions(+), 11 deletions(-)
> 

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH v3 13/25] target-sparc: Split out gen_branch_n
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 13/25] target-sparc: Split out gen_branch_n Richard Henderson
@ 2015-09-24 19:42   ` Aurelien Jarno
  0 siblings, 0 replies; 53+ messages in thread
From: Aurelien Jarno @ 2015-09-24 19:42 UTC (permalink / raw)
  To: Richard Henderson; +Cc: peter.maydell, alex.bennee, qemu-devel

On 2015-09-22 13:24, Richard Henderson wrote:
> Unify three copies of this code from different
> branch types.  Fix the case when npc == DYNAMIC_PC,
> i.e. a branch within a delay slot.
> 
> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  target-sparc/translate.c | 55 ++++++++++++++++++++++++------------------------
>  1 file changed, 28 insertions(+), 27 deletions(-)

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH v3 14/25] target-sparc: Remove gen_opc_jump_pc
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 14/25] target-sparc: Remove gen_opc_jump_pc Richard Henderson
@ 2015-09-24 19:42   ` Aurelien Jarno
  0 siblings, 0 replies; 53+ messages in thread
From: Aurelien Jarno @ 2015-09-24 19:42 UTC (permalink / raw)
  To: Richard Henderson; +Cc: peter.maydell, alex.bennee, qemu-devel

On 2015-09-22 13:24, Richard Henderson wrote:
> Since jump_pc[1] is always npc + 4, we can infer after incrementing
> that jump_pc[1] == pc + 4.  Because of that, we can encode the branch
> destination into a single word, and store that in npc.
> 
> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  target-sparc/translate.c | 19 ++++++++++---------
>  1 file changed, 10 insertions(+), 9 deletions(-)

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH v3 15/25] target-sparc: Add npc state to insn_start
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 15/25] target-sparc: Add npc state to insn_start Richard Henderson
@ 2015-09-24 19:42   ` Aurelien Jarno
  0 siblings, 0 replies; 53+ messages in thread
From: Aurelien Jarno @ 2015-09-24 19:42 UTC (permalink / raw)
  To: Richard Henderson; +Cc: peter.maydell, alex.bennee, qemu-devel

On 2015-09-22 13:24, Richard Henderson wrote:
> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  target-sparc/cpu.h       | 1 +
>  target-sparc/translate.c | 7 ++++++-
>  2 files changed, 7 insertions(+), 1 deletion(-)

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> 

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH v3 16/25] tcg: Merge cpu_gen_code into tb_gen_code
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 16/25] tcg: Merge cpu_gen_code into tb_gen_code Richard Henderson
@ 2015-09-24 19:48   ` Aurelien Jarno
  0 siblings, 0 replies; 53+ messages in thread
From: Aurelien Jarno @ 2015-09-24 19:48 UTC (permalink / raw)
  To: Richard Henderson; +Cc: peter.maydell, alex.bennee, qemu-devel

On 2015-09-22 13:24, Richard Henderson wrote:
> As it's only caller, this tidies things a bit.
> 
> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  include/exec/exec-all.h |   2 -
>  translate-all.c         | 131 ++++++++++++++++++++++--------------------------
>  2 files changed, 59 insertions(+), 74 deletions(-)

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH v3 17/25] target-*: Drop cpu_gen_code define
  2015-09-22 20:24 ` [Qemu-devel] [PATCH v3 17/25] target-*: Drop cpu_gen_code define Richard Henderson
@ 2015-09-24 19:49   ` Aurelien Jarno
  0 siblings, 0 replies; 53+ messages in thread
From: Aurelien Jarno @ 2015-09-24 19:49 UTC (permalink / raw)
  To: Richard Henderson; +Cc: peter.maydell, alex.bennee, qemu-devel

On 2015-09-22 13:24, Richard Henderson wrote:
> This symbol no longer exists.
> 
> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  target-alpha/cpu.h      | 1 -
>  target-arm/cpu.h        | 1 -
>  target-cris/cpu.h       | 1 -
>  target-i386/cpu.h       | 1 -
>  target-lm32/cpu.h       | 1 -
>  target-m68k/cpu.h       | 1 -
>  target-microblaze/cpu.h | 1 -
>  target-mips/cpu.h       | 1 -
>  target-moxie/cpu.h      | 1 -
>  target-openrisc/cpu.h   | 1 -
>  target-ppc/cpu.h        | 1 -
>  target-s390x/cpu.h      | 1 -
>  target-sh4/cpu.h        | 1 -
>  target-sparc/cpu.h      | 1 -
>  target-tilegx/cpu.h     | 1 -
>  target-xtensa/cpu.h     | 1 -
>  16 files changed, 16 deletions(-)

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH v3 18/25] tcg: Add TCG_MAX_INSNS
  2015-09-22 20:25 ` [Qemu-devel] [PATCH v3 18/25] tcg: Add TCG_MAX_INSNS Richard Henderson
@ 2015-09-24 20:02   ` Aurelien Jarno
  2015-09-24 20:43     ` Richard Henderson
  0 siblings, 1 reply; 53+ messages in thread
From: Aurelien Jarno @ 2015-09-24 20:02 UTC (permalink / raw)
  To: Richard Henderson; +Cc: peter.maydell, alex.bennee, qemu-devel

On 2015-09-22 13:25, Richard Henderson wrote:
> Adjust all translators to respect it.
> 
> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  target-alpha/translate.c      |  3 +++
>  target-arm/translate-a64.c    |  3 +++
>  target-arm/translate.c        |  6 +++++-
>  target-cris/translate.c       |  3 +++
>  target-i386/translate.c       |  6 +++++-
>  target-lm32/translate.c       |  3 +++
>  target-m68k/translate.c       |  6 +++++-
>  target-microblaze/translate.c |  6 +++++-
>  target-mips/translate.c       |  7 ++++++-
>  target-moxie/translate.c      | 13 +++++++++++--
>  target-openrisc/translate.c   |  3 +++
>  target-ppc/translate.c        |  6 +++++-
>  target-s390x/translate.c      |  3 +++
>  target-sh4/translate.c        |  7 ++++++-
>  target-sparc/translate.c      |  7 ++++++-
>  target-tilegx/translate.c     |  3 +++
>  target-tricore/translate.c    | 20 +++++++++++++-------
>  target-unicore32/translate.c  |  3 +++
>  target-xtensa/translate.c     |  3 +++
>  tcg/tcg.h                     |  1 +
>  20 files changed, 95 insertions(+), 17 deletions(-)
> 
> diff --git a/target-alpha/translate.c b/target-alpha/translate.c
> index c10193e..538e202 100644
> --- a/target-alpha/translate.c
> +++ b/target-alpha/translate.c
> @@ -2903,6 +2903,9 @@ static inline void gen_intermediate_code_internal(AlphaCPU *cpu,
>      if (max_insns == 0) {
>          max_insns = CF_COUNT_MASK;
>      }

I guess you can also change the value to TCG_MAX_INSNS, though I
guess the compiler will figure that out.

> +    if (max_insns > TCG_MAX_INSNS) {
> +        max_insns = TCG_MAX_INSNS;
> +    }
>  
>      if (in_superpage(&ctx, pc_start)) {
>          pc_mask = (1ULL << 41) - 1;

Given we have the same pattern in all targets, I do wonder if it
wouldn't be better to just set (cflags & CF_COUNT_MASK) to
TCG_MAX_INSNS instead of 0 in translate-all.c when not using icount.

That said your code is correct, so:

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH v3 19/25] tcg: Pass data argument to restore_state_to_opc
  2015-09-22 20:25 ` [Qemu-devel] [PATCH v3 19/25] tcg: Pass data argument to restore_state_to_opc Richard Henderson
@ 2015-09-24 20:11   ` Aurelien Jarno
  0 siblings, 0 replies; 53+ messages in thread
From: Aurelien Jarno @ 2015-09-24 20:11 UTC (permalink / raw)
  To: Richard Henderson; +Cc: peter.maydell, alex.bennee, qemu-devel

On 2015-09-22 13:25, Richard Henderson wrote:
> The gen_opc_* arrays are already redundant with the data stored in
> the insn_start arguments.  Transition restore_state_to_opc to use
> data from the latter.
> 
> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  include/exec/exec-all.h       |  2 +-
>  target-alpha/translate.c      |  5 +++--
>  target-arm/translate.c        |  9 +++++----
>  target-cris/translate.c       |  5 +++--
>  target-i386/translate.c       | 26 ++++++--------------------
>  target-lm32/translate.c       |  5 +++--
>  target-m68k/translate.c       |  5 +++--
>  target-microblaze/translate.c |  5 +++--
>  target-mips/translate.c       |  9 +++++----
>  target-moxie/translate.c      |  5 +++--
>  target-openrisc/translate.c   |  4 ++--
>  target-ppc/translate.c        |  5 +++--
>  target-s390x/translate.c      |  8 ++++----
>  target-sh4/translate.c        |  7 ++++---
>  target-sparc/translate.c      | 10 ++++++----
>  target-tilegx/translate.c     |  5 +++--
>  target-tricore/translate.c    |  5 +++--
>  target-unicore32/translate.c  |  5 +++--
>  target-xtensa/translate.c     |  5 +++--
>  tcg/tcg.c                     | 11 ++++++++++-
>  tcg/tcg.h                     |  2 ++
>  translate-all.c               |  2 +-
>  22 files changed, 79 insertions(+), 66 deletions(-)

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH v3 18/25] tcg: Add TCG_MAX_INSNS
  2015-09-24 20:02   ` Aurelien Jarno
@ 2015-09-24 20:43     ` Richard Henderson
  0 siblings, 0 replies; 53+ messages in thread
From: Richard Henderson @ 2015-09-24 20:43 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: peter.maydell, alex.bennee, qemu-devel

On 09/24/2015 01:02 PM, Aurelien Jarno wrote:
>> @@ -2903,6 +2903,9 @@ static inline void gen_intermediate_code_internal(AlphaCPU *cpu,
>>      if (max_insns == 0) {
>>          max_insns = CF_COUNT_MASK;
>>      }
> 
> I guess you can also change the value to TCG_MAX_INSNS, though I
> guess the compiler will figure that out.

I did wonder about the best thing to do re CF_COUNT_MASK.  Especially as it's
currently set to 0x7fff.  FWIW, the largest TB I've seen so far while
collecting statistics is 157 insns.  So the current setting of TCG_MAX_INSNS at
512 is more than enough.

> 
>> +    if (max_insns > TCG_MAX_INSNS) {
>> +        max_insns = TCG_MAX_INSNS;
>> +    }
>>  
>>      if (in_superpage(&ctx, pc_start)) {
>>          pc_mask = (1ULL << 41) - 1;
> 
> Given we have the same pattern in all targets, I do wonder if it
> wouldn't be better to just set (cflags & CF_COUNT_MASK) to
> TCG_MAX_INSNS instead of 0 in translate-all.c when not using icount.

Yes, that would probably be best.

There should probably be some helper function that handles all these as well as
noticing single-stepping.  Too many targets test

   (num_insns >= max_insns || singlestep || ...)

when we could just as well set max_insns to 1 and have just the one runtime
test.  Then there's all the targets which have a fixed insn size, where we can
pre-compute the number of insns left on the page, and fold in the end-of-page
test as well.

I'll put cleaning this up on the to-do list.
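
As a rough illustration of what such a helper might compute (a sketch only;
the function name, its parameters, and the way the page size is passed in are
all hypothetical, while the two constants use the values mentioned above):

```c
#include <stdint.h>

#define TCG_MAX_INSNS 512     /* value quoted in this thread */
#define CF_COUNT_MASK 0x7fff  /* value quoted in this thread */

/* Hypothetical helper: fold the icount limit, the TCG limit,
   single-stepping and (for fixed-size insns) the end-of-page limit
   into one number, so the translator loop needs a single test.
   PAGE_SIZE must be a power of two.  Pass INSN_SIZE == 0 for targets
   with variable-length insns, which skips the end-of-page folding.  */
static int compute_max_insns(uint32_t cflags, int singlestep,
                             uint64_t pc, uint64_t page_size,
                             unsigned insn_size)
{
    int max_insns = (int)(cflags & CF_COUNT_MASK);

    /* Zero means "no icount limit"; clamp to the TCG limit either way.  */
    if (max_insns == 0 || max_insns > TCG_MAX_INSNS) {
        max_insns = TCG_MAX_INSNS;
    }
    if (singlestep) {
        max_insns = 1;
    }
    if (insn_size != 0) {
        /* Number of whole insns from PC to the end of its page.  */
        uint64_t left = (page_size - (pc & (page_size - 1))) / insn_size;

        if (left < (uint64_t)max_insns) {
            max_insns = (int)left;
        }
    }
    return max_insns;
}
```

With that, the per-target loop condition would shrink to a plain
"num_insns >= max_insns" test, at least for the cases covered here.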


r~

> 
> That said your code is correct, so:
> 
> Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
> 

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH v3 20/25] tcg: Save insn data and use it in cpu_restore_state_from_tb
  2015-09-22 20:25 ` [Qemu-devel] [PATCH v3 20/25] tcg: Save insn data and use it in cpu_restore_state_from_tb Richard Henderson
  2015-09-23 19:20   ` Peter Maydell
@ 2015-09-25 21:10   ` Aurelien Jarno
  2015-09-25 23:05     ` Richard Henderson
  1 sibling, 1 reply; 53+ messages in thread
From: Aurelien Jarno @ 2015-09-25 21:10 UTC (permalink / raw)
  To: Richard Henderson; +Cc: peter.maydell, alex.bennee, qemu-devel

On 2015-09-22 13:25, Richard Henderson wrote:
> We can now restore state without retranslation.
> 
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  include/exec/exec-all.h |   1 +
>  tcg/tcg.c               |  40 ++++++++-----
>  tcg/tcg.h               |   4 +-
>  translate-all.c         | 149 +++++++++++++++++++++++++++++++++++-------------
>  4 files changed, 139 insertions(+), 55 deletions(-)
> 
> diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
> index 6a69802..402dd87 100644
> --- a/include/exec/exec-all.h
> +++ b/include/exec/exec-all.h
> @@ -199,6 +199,7 @@ struct TranslationBlock {
>  #define CF_USE_ICOUNT  0x20000
>  
>      void *tc_ptr;    /* pointer to the translated code */
> +    uint8_t *tc_search;  /* pointer to search data */
>      /* next matching tb for physical address. */
>      struct TranslationBlock *phys_hash_next;
>      /* original tb when cflags has CF_NOCACHE */
> diff --git a/tcg/tcg.c b/tcg/tcg.c
> index bdb83d9..a0fce5b 100644
> --- a/tcg/tcg.c
> +++ b/tcg/tcg.c
> @@ -2294,7 +2294,7 @@ static inline int tcg_gen_code_common(TCGContext *s,
>                                        tcg_insn_unit *gen_code_buf,
>                                        long search_pc)
>  {
> -    int i, oi, oi_next;
> +    int i, oi, oi_next, num_insns;
>  
>  #ifdef DEBUG_DISAS
>      if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP))) {
> @@ -2338,6 +2338,7 @@ static inline int tcg_gen_code_common(TCGContext *s,
>  
>      tcg_out_tb_init(s);
>  
> +    num_insns = -1;
>      for (oi = s->gen_first_op_idx; oi >= 0; oi = oi_next) {
>          TCGOp * const op = &s->gen_op_buf[oi];
>          TCGArg * const args = &s->gen_opparam_buf[op->args];
> @@ -2361,6 +2362,10 @@ static inline int tcg_gen_code_common(TCGContext *s,
>              tcg_reg_alloc_movi(s, args, dead_args, sync_args);
>              break;
>          case INDEX_op_insn_start:
> +            if (num_insns >= 0) {
> +                s->gen_insn_end_off[num_insns] = tcg_current_code_size(s);
> +            }
> +            num_insns++;
>              for (i = 0; i < TARGET_INSN_START_WORDS; ++i) {
>                  target_ulong a;
>  #if TARGET_LONG_BITS > TCG_TARGET_REG_BITS
> @@ -2368,7 +2373,7 @@ static inline int tcg_gen_code_common(TCGContext *s,
>  #else
>                  a = args[i];
>  #endif
> -                s->gen_opc_data[i] = a;
> +                s->gen_insn_data[num_insns][i] = a;
>              }
>              break;
>          case INDEX_op_discard:
> @@ -2400,6 +2405,8 @@ static inline int tcg_gen_code_common(TCGContext *s,
>          check_regs(s);
>  #endif
>      }
> +    tcg_debug_assert(num_insns >= 0);
> +    s->gen_insn_end_off[num_insns] = tcg_current_code_size(s);
>  
>      /* Generate TB finalization at the end of block */
>      tcg_out_tb_finalize(s);
> @@ -2448,24 +2455,26 @@ int tcg_gen_code_search_pc(TCGContext *s, tcg_insn_unit *gen_code_buf,
>  void tcg_dump_info(FILE *f, fprintf_function cpu_fprintf)
>  {
>      TCGContext *s = &tcg_ctx;
> -    int64_t tot;
> +    int64_t tb_count = s->tb_count;
> +    int64_t tb_div_count = tb_count ? tb_count : 1;
> +    int64_t tot = s->interm_time + s->code_time;
>  
> -    tot = s->interm_time + s->code_time;
>      cpu_fprintf(f, "JIT cycles          %" PRId64 " (%0.3f s at 2.4 GHz)\n",
>                  tot, tot / 2.4e9);
>      cpu_fprintf(f, "translated TBs      %" PRId64 " (aborted=%" PRId64 " %0.1f%%)\n", 
> -                s->tb_count, 
> -                s->tb_count1 - s->tb_count,
> -                s->tb_count1 ? (double)(s->tb_count1 - s->tb_count) / s->tb_count1 * 100.0 : 0);
> +                tb_count, s->tb_count1 - tb_count,
> +                (double)(s->tb_count1 - s->tb_count)
> +                / (s->tb_count1 ? s->tb_count1 : 1) * 100.0);
>      cpu_fprintf(f, "avg ops/TB          %0.1f max=%d\n", 
> -                s->tb_count ? (double)s->op_count / s->tb_count : 0, s->op_count_max);
> +                (double)s->op_count / tb_div_count, s->op_count_max);
>      cpu_fprintf(f, "deleted ops/TB      %0.2f\n",
> -                s->tb_count ? 
> -                (double)s->del_op_count / s->tb_count : 0);
> +                (double)s->del_op_count / tb_div_count);
>      cpu_fprintf(f, "avg temps/TB        %0.2f max=%d\n",
> -                s->tb_count ? 
> -                (double)s->temp_count / s->tb_count : 0,
> -                s->temp_count_max);
> +                (double)s->temp_count / tb_div_count, s->temp_count_max);
> +    cpu_fprintf(f, "avg host code/TB    %0.1f\n",
> +                (double)s->code_out_len / tb_div_count);
> +    cpu_fprintf(f, "avg search data/TB  %0.1f\n",
> +                (double)s->search_out_len / tb_div_count);
>      
>      cpu_fprintf(f, "cycles/op           %0.1f\n", 
>                  s->op_count ? (double)tot / s->op_count : 0);
> @@ -2473,8 +2482,11 @@ void tcg_dump_info(FILE *f, fprintf_function cpu_fprintf)
>                  s->code_in_len ? (double)tot / s->code_in_len : 0);
>      cpu_fprintf(f, "cycles/out byte     %0.1f\n", 
>                  s->code_out_len ? (double)tot / s->code_out_len : 0);
> -    if (tot == 0)
> +    cpu_fprintf(f, "cycles/search byte     %0.1f\n", 
> +                s->search_out_len ? (double)tot / s->search_out_len : 0);
> +    if (tot == 0) {
>          tot = 1;
> +    }
>      cpu_fprintf(f, "  gen_interm time   %0.1f%%\n", 
>                  (double)s->interm_time / tot * 100.0);
>      cpu_fprintf(f, "  gen_code time     %0.1f%%\n", 
> diff --git a/tcg/tcg.h b/tcg/tcg.h
> index 8fd1252..df499c6 100644
> --- a/tcg/tcg.h
> +++ b/tcg/tcg.h
> @@ -532,6 +532,7 @@ struct TCGContext {
>      int64_t del_op_count;
>      int64_t code_in_len;
>      int64_t code_out_len;
> +    int64_t search_out_len;
>      int64_t interm_time;
>      int64_t code_time;
>      int64_t la_time;
> @@ -581,7 +582,8 @@ struct TCGContext {
>      uint16_t gen_opc_icount[OPC_BUF_SIZE];
>      uint8_t gen_opc_instr_start[OPC_BUF_SIZE];
>  
> -    target_ulong gen_opc_data[TARGET_INSN_START_WORDS];
> +    uint16_t gen_insn_end_off[TCG_MAX_INSNS];
> +    target_ulong gen_insn_data[TCG_MAX_INSNS][TARGET_INSN_START_WORDS];
>  };
>  
>  extern TCGContext tcg_ctx;
> diff --git a/translate-all.c b/translate-all.c
> index 9f801ae..f6b8148 100644
> --- a/translate-all.c
> +++ b/translate-all.c
> @@ -168,61 +168,127 @@ void cpu_gen_init(void)
>      tcg_context_init(&tcg_ctx); 
>  }
>  
> +/* Encode VAL as a signed leb128 sequence at P.
> +   Return P incremented past the encoded value.  */
> +static uint8_t *encode_sleb128(uint8_t *p, target_long val)
> +{
> +    int more, byte;
> +
> +    do {
> +        byte = val & 0x7f;
> +        val >>= 7;
> +        more = !((val == 0 && (byte & 0x40) == 0)
> +                 || (val == -1 && (byte & 0x40) != 0));
> +        if (more)
> +          byte |= 0x80;

You are missing braces here.

> +        *p++ = byte;
> +    } while (more);
> +
> +    return p;
> +}
> +
> +/* Decode a signed leb128 sequence at *PP; increment *PP past the
> +   decoded value.  Return the decoded value.  */
> +static target_long decode_sleb128(uint8_t **pp)
> +{
> +    uint8_t *p = *pp;
> +    target_long val = 0;
> +    int byte, shift = 0;
> +
> +    do {
> +        byte = *p++;
> +        val |= (target_ulong)(byte & 0x7f) << shift;
> +        shift += 7;
> +    } while (byte & 0x80);
> +    if (shift < TARGET_LONG_BITS && (byte & 0x40)) {
> +        val |= -(target_ulong)1 << shift;
> +    }
> +
> +    *pp = p;
> +    return val;
> +}
> +
> +/* Encode the data collected about the instructions while compiling TB.
> +   Place the data at BLOCK, and return the number of bytes consumed.
> +
> +   The logical table consists of TARGET_INSN_START_WORDS target_ulong's,
> +   which come from the target's insn_start data, followed by a uintptr_t
> +   which comes from the host pc of the end of the code implementing the insn.
> +
> +   Each line of the table is encoded as sleb128 deltas from the previous
> +   line.  The seed for the first line is { tb->pc, 0..., tb->tc_ptr }.
> +   That is, the first column is seeded with the guest pc, the last column
> +   with the host pc, and the middle columns with zeros.  */
> +
> +static int encode_search(TranslationBlock *tb, uint8_t *block)
> +{
> +    uint8_t *p = block;
> +    int i, j, n;
> +
> +    tb->tc_search = block;
> +
> +    for (i = 0, n = tb->icount; i < n; ++i) {
> +        target_ulong prev;
> +
> +        for (j = 0; j < TARGET_INSN_START_WORDS; ++j) {
> +            if (i == 0) {
> +                prev = (j == 0 ? tb->pc : 0);
> +            } else {
> +                prev = tcg_ctx.gen_insn_data[i - 1][j];
> +            }
> +            p = encode_sleb128(p, tcg_ctx.gen_insn_data[i][j] - prev);
> +        }
> +        prev = (i == 0 ? 0 : tcg_ctx.gen_insn_end_off[i - 1]);
> +        p = encode_sleb128(p, tcg_ctx.gen_insn_end_off[i] - prev);
> +    }
> +
> +    return p - block;
> +}
> +
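
For readers following the encoding, here is a minimal standalone sketch of
the sleb128 round trip that the two helpers above implement.  It is not the
patch code: int64_t stands in for target_long, the names sleb128_put,
sleb128_get and sleb128_roundtrip are made up for the illustration, and the
encoder assumes that >> on a negative signed value is an arithmetic shift.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode VAL as signed LEB128 at P; return P advanced past the value.
   Assumes >> on a negative int64_t is an arithmetic shift.  */
static uint8_t *sleb128_put(uint8_t *p, int64_t val)
{
    int more;

    do {
        int byte = val & 0x7f;
        val >>= 7;
        /* Stop once the rest is pure sign extension of bit 6 of BYTE.  */
        more = !((val == 0 && (byte & 0x40) == 0)
                 || (val == -1 && (byte & 0x40) != 0));
        if (more) {
            byte |= 0x80;
        }
        *p++ = (uint8_t)byte;
    } while (more);

    return p;
}

/* Decode a well-formed signed LEB128 value at *PP; advance *PP past it.  */
static int64_t sleb128_get(const uint8_t **pp)
{
    const uint8_t *p = *pp;
    uint64_t val = 0;
    int byte, shift = 0;

    do {
        byte = *p++;
        val |= (uint64_t)(byte & 0x7f) << shift;
        shift += 7;
    } while (byte & 0x80);
    /* Sign-extend when the last byte had bit 6 set.  */
    if (shift < 64 && (byte & 0x40)) {
        val |= ~(uint64_t)0 << shift;
    }

    *pp = p;
    return (int64_t)val;
}

/* Encode then decode VAL; report the encoded length in *LEN.  */
static int64_t sleb128_roundtrip(int64_t val, size_t *len)
{
    uint8_t buf[10];                 /* 10 bytes hold any 64-bit value */
    const uint8_t *q = buf;
    uint8_t *end = sleb128_put(buf, val);
    int64_t out = sleb128_get(&q);

    assert(q == end);                /* decoder consumed what was written */
    *len = (size_t)(end - buf);
    return out;
}
```

Any delta in the range -64..63 takes a single byte, so rows that differ
only slightly from the previous row stay small.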

Given that we save both the host and the guest PC in this structure, one
obvious optimization would be to skip saving data for host instructions
which cannot generate an exception. That means that none of the TCG ops in
this instruction generate exceptions either. We can easily test that
for all TCG instructions except call by looking at the
TCG_OPF_SIDE_EFFECTS flag. For the call op, we have to look at the
TCG_CALL_NO_SIDE_EFFECTS flag, even if it doesn't necessarily mean the
helper might generate an exception.

That should save significant space on load/store architectures. That
said, we can probably do that at a later time.
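
A rough sketch of that test, under stated assumptions: the SketchOp type,
the SK_* constants and both function names are hypothetical stand-ins, and
the flag values are placeholders rather than the real definitions of
TCG_OPF_SIDE_EFFECTS and TCG_CALL_NO_SIDE_EFFECTS.  The check is
deliberately conservative, so a call without the no-side-effects flag is
treated as possibly raising an exception.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder flag values for this sketch only.  */
enum {
    SK_OPF_SIDE_EFFECTS     = 1 << 0, /* stands in for TCG_OPF_SIDE_EFFECTS */
    SK_CALL_NO_SIDE_EFFECTS = 1 << 1, /* stands in for TCG_CALL_NO_SIDE_EFFECTS */
};

/* Hypothetical, simplified view of one TCG op.  */
typedef struct {
    bool is_call;          /* true for the call op */
    unsigned op_flags;     /* flags from the opcode definition */
    unsigned call_flags;   /* helper flags, meaningful only for calls */
} SketchOp;

/* Conservatively decide whether OP might raise an exception.  */
static bool op_may_fault(const SketchOp *op)
{
    if (op->is_call) {
        return !(op->call_flags & SK_CALL_NO_SIDE_EFFECTS);
    }
    return (op->op_flags & SK_OPF_SIDE_EFFECTS) != 0;
}

/* A guest insn needs a search row if any of its N ops might fault.  */
static bool insn_needs_search_row(const SketchOp *ops, int n)
{
    for (int i = 0; i < n; i++) {
        if (op_may_fault(&ops[i])) {
            return true;
        }
    }
    return false;
}
```

With this shape, a register-only insn would produce no row at all, while
an insn containing a load, a store or an unflagged helper call would.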

>  /* The cpu state corresponding to 'searched_pc' is restored.  */
>  static int cpu_restore_state_from_tb(CPUState *cpu, TranslationBlock *tb,
>                                       uintptr_t searched_pc)
>  {
> +    target_ulong data[TARGET_INSN_START_WORDS] = { tb->pc };
> +    uintptr_t host_pc = (uintptr_t)tb->tc_ptr;
>      CPUArchState *env = cpu->env_ptr;
> -    TCGContext *s = &tcg_ctx;
> -    int j;
> -    uintptr_t tc_ptr;
> +    uint8_t *p = tb->tc_search;
> +    int i, j, num_insns = tb->icount;
>  #ifdef CONFIG_PROFILER
> -    int64_t ti;
> +    int64_t ti = profile_getclock();
>  #endif
>  
> -#ifdef CONFIG_PROFILER
> -    ti = profile_getclock();
> -#endif
> -    tcg_func_start(s);
> +    if (searched_pc < host_pc) {
> +        return -1;
> +    }
>  
> -    gen_intermediate_code_pc(env, tb);
> +    /* Reconstruct the stored insn data while looking for the point at
> +       which the end of the insn exceeds the searched_pc.  */
> +    for (i = 0; i < num_insns; ++i) {
> +        for (j = 0; j < TARGET_INSN_START_WORDS; ++j) {
> +            data[j] += decode_sleb128(&p);
> +        }
> +        host_pc += decode_sleb128(&p);
> +        if (host_pc > searched_pc) {
> +            goto found;
> +        }
> +    }
> +    return -1;
>  
> + found:
>      if (tb->cflags & CF_USE_ICOUNT) {
>          assert(use_icount);
>          /* Reset the cycle counter to the start of the block.  */
> -        cpu->icount_decr.u16.low += tb->icount;
> +        cpu->icount_decr.u16.low += num_insns;
>          /* Clear the IO flag.  */
>          cpu->can_do_io = 0;
>      }
> -
> -    /* find opc index corresponding to search_pc */
> -    tc_ptr = (uintptr_t)tb->tc_ptr;
> -    if (searched_pc < tc_ptr)
> -        return -1;
> -
> -    s->tb_next_offset = tb->tb_next_offset;
> -#ifdef USE_DIRECT_JUMP
> -    s->tb_jmp_offset = tb->tb_jmp_offset;
> -    s->tb_next = NULL;
> -#else
> -    s->tb_jmp_offset = NULL;
> -    s->tb_next = tb->tb_next;
> -#endif
> -    j = tcg_gen_code_search_pc(s, (tcg_insn_unit *)tc_ptr,
> -                               searched_pc - tc_ptr);
> -    if (j < 0)
> -        return -1;
> -    /* now find start of instruction before */
> -    while (s->gen_opc_instr_start[j] == 0) {
> -        j--;
> -    }
> -    cpu->icount_decr.u16.low -= s->gen_opc_icount[j];
> -
> -    restore_state_to_opc(env, tb, s->gen_opc_data);
> +    cpu->icount_decr.u16.low -= i;
> +    restore_state_to_opc(env, tb, data);
>  
>  #ifdef CONFIG_PROFILER
> -    s->restore_time += profile_getclock() - ti;
> -    s->restore_count++;
> +    tcg_ctx.restore_time += profile_getclock() - ti;
> +    tcg_ctx.restore_count++;
>  #endif
>      return 0;
>  }
> @@ -969,7 +1035,7 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
>      tb_page_addr_t phys_pc, phys_page2;
>      target_ulong virt_page2;
>      tcg_insn_unit *gen_code_buf;
> -    int gen_code_size;
> +    int gen_code_size, search_size;
>  #ifdef CONFIG_PROFILER
>      int64_t ti;
>  #endif
> @@ -1025,11 +1091,13 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
>  #endif
>  
>      gen_code_size = tcg_gen_code(&tcg_ctx, gen_code_buf);
> +    search_size = encode_search(tb, (void *)gen_code_buf + gen_code_size);
>  
>  #ifdef CONFIG_PROFILER
>      tcg_ctx.code_time += profile_getclock();
>      tcg_ctx.code_in_len += tb->size;
>      tcg_ctx.code_out_len += gen_code_size;
> +    tcg_ctx.search_out_len += search_size;
>  #endif
>  
>  #ifdef DEBUG_DISAS
> @@ -1041,8 +1109,9 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
>      }
>  #endif
>  
> -    tcg_ctx.code_gen_ptr = (void *)(((uintptr_t)gen_code_buf +
> -            gen_code_size + CODE_GEN_ALIGN - 1) & ~(CODE_GEN_ALIGN - 1));
> +    tcg_ctx.code_gen_ptr = (void *)
> +        ROUND_UP((uintptr_t)gen_code_buf + gen_code_size + search_size,
> +                 CODE_GEN_ALIGN);
>  
>      /* check next page if needed */
>      virt_page2 = (pc + tb->size - 1) & TARGET_PAGE_MASK;

If you fix the coding style issue I mentioned above, you get:

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net


* Re: [Qemu-devel] [PATCH v3 21/25] tcg: Remove gen_intermediate_code_pc
  2015-09-22 20:25 ` [Qemu-devel] [PATCH v3 21/25] tcg: Remove gen_intermediate_code_pc Richard Henderson
@ 2015-09-25 21:11   ` Aurelien Jarno
  0 siblings, 0 replies; 53+ messages in thread
From: Aurelien Jarno @ 2015-09-25 21:11 UTC (permalink / raw)
  To: Richard Henderson; +Cc: peter.maydell, alex.bennee, qemu-devel

On 2015-09-22 13:25, Richard Henderson wrote:
> It is no longer used, so tidy up everything reached by it.
> This includes the gen_opc_* arrays, the search_pc parameter
> and the inline gen_intermediate_code_internal functions.
> 
> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  include/exec/exec-all.h       |  1 -
>  target-alpha/translate.c      | 41 ++++----------------------------
>  target-arm/translate-a64.c    | 30 +++---------------------
>  target-arm/translate.c        | 54 ++++++++-----------------------------------
>  target-arm/translate.h        |  8 ++-----
>  target-cris/translate.c       | 50 +++++----------------------------------
>  target-i386/translate.c       | 49 ++++-----------------------------------
>  target-lm32/translate.c       | 42 ++++-----------------------------
>  target-m68k/translate.c       | 43 ++++------------------------------
>  target-microblaze/translate.c | 40 ++++----------------------------
>  target-mips/translate.c       | 48 ++++----------------------------------
>  target-moxie/translate.c      | 41 ++++----------------------------
>  target-openrisc/translate.c   | 42 ++++-----------------------------
>  target-ppc/translate.c        | 40 ++++----------------------------
>  target-s390x/translate.c      | 44 ++++-------------------------------
>  target-sh4/translate.c        | 43 ++++------------------------------
>  target-sparc/translate.c      | 51 ++++------------------------------------
>  target-tilegx/translate.c     | 41 ++++----------------------------
>  target-tricore/translate.c    | 31 ++++---------------------
>  target-unicore32/translate.c  | 44 ++++-------------------------------
>  target-xtensa/translate.c     | 39 ++++---------------------------
>  tcg/tcg.h                     |  4 ----
>  22 files changed, 90 insertions(+), 736 deletions(-)

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net


* Re: [Qemu-devel] [PATCH v3 22/25] tcg: Remove tcg_gen_code_search_pc
  2015-09-22 20:25 ` [Qemu-devel] [PATCH v3 22/25] tcg: Remove tcg_gen_code_search_pc Richard Henderson
@ 2015-09-25 21:11   ` Aurelien Jarno
  0 siblings, 0 replies; 53+ messages in thread
From: Aurelien Jarno @ 2015-09-25 21:11 UTC (permalink / raw)
  To: Richard Henderson; +Cc: peter.maydell, alex.bennee, qemu-devel

On 2015-09-22 13:25, Richard Henderson wrote:
> It's no longer used, so tidy up everything reached by it.
> 
> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  tcg/tcg.c | 59 +++++++++++++++++++----------------------------------------
>  tcg/tcg.h |  2 --
>  2 files changed, 19 insertions(+), 42 deletions(-)

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net


* Re: [Qemu-devel] [PATCH v3 20/25] tcg: Save insn data and use it in cpu_restore_state_from_tb
  2015-09-25 21:10   ` Aurelien Jarno
@ 2015-09-25 23:05     ` Richard Henderson
  0 siblings, 0 replies; 53+ messages in thread
From: Richard Henderson @ 2015-09-25 23:05 UTC (permalink / raw)
  To: Aurelien Jarno; +Cc: peter.maydell, alex.bennee, qemu-devel

On 09/25/2015 02:10 PM, Aurelien Jarno wrote:
>> +        if (more)
>> +          byte |= 0x80;
> 
> You are missing braces here.

Gah.  I thought I fixed that...

> Given that we save both the host and the guest PC in this structure, one
> obvious optimization would be to skip saving data for host instructions
> which cannot generate an exception. That means that none of the TCG ops in
> this instruction generate exceptions either. We can easily test that
> for all TCG instructions except call by looking at the
> TCG_OPF_SIDE_EFFECTS flag. For the call op, we have to look at the
> TCG_CALL_NO_SIDE_EFFECTS flag, even if it doesn't necessarily mean the
> helper might generate an exception.
> 
> That should save significant space on load/store architectures. That
> said, we can probably do that at a later time.

Yes, Alex Bennee mentioned this during round 1.  I decided to not try to do
that all at once.

When we do get there, we also have to add an additional column for icount.
It's currently inferred that each entry is 1 insn.  This will expand the size
of the table in the case where every insn might raise an exception, but I
expect the normal case to be a slight decrease.
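
To make the extra column concrete, here is a hypothetical sketch of a
lookup over such a table.  None of this is in the series: SearchRow,
search_lookup and the row layout are invented for illustration, rows are
kept as plain structs rather than sleb128 bytes, and each row is assumed
to cover the skipped insns plus the one insn that may fault.

```c
#include <assert.h>
#include <stdint.h>

/* One row per guest insn that may fault; all fields are deltas from
   the previous row, seeded with { tb pc, 0, tc_ptr }.  */
typedef struct {
    int64_t pc_delta;        /* guest pc delta to the insn that may fault */
    unsigned icount_delta;   /* guest insns covered by this row, >= 1 */
    unsigned host_end_delta; /* host code bytes covered by this row */
} SearchRow;

/* Walk the rows until the end of a row's host code exceeds SEARCHED_PC.
   On success store the guest pc of the faulting insn and the number of
   guest insns completed before it, and return 0.  Return -1 otherwise.  */
static int search_lookup(const SearchRow *rows, int n,
                         uint64_t tb_pc, uintptr_t tc_ptr,
                         uintptr_t searched_pc,
                         uint64_t *guest_pc, unsigned *insns_done)
{
    uint64_t pc = tb_pc;
    uintptr_t host_pc = tc_ptr;
    unsigned icount = 0;

    if (searched_pc < host_pc) {
        return -1;
    }
    for (int i = 0; i < n; i++) {
        assert(rows[i].icount_delta >= 1);
        pc += (uint64_t)rows[i].pc_delta;
        icount += rows[i].icount_delta;
        host_pc += rows[i].host_end_delta;
        if (host_pc > searched_pc) {
            *guest_pc = pc;
            *insns_done = icount - 1;   /* the faulting insn did not finish */
            return 0;
        }
    }
    return -1;
}
```

With one row per insn and icount_delta fixed at 1, this degenerates to the
current scheme, where the loop index alone gives the insn count.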



r~

