* [Qemu-devel] [PATCH 2.3 0/8] Linked list for tcg ops
@ 2014-11-11 16:24 Richard Henderson
  2014-11-11 16:24 ` [Qemu-devel] [PATCH 2.3 1/8] tcg: Move some opcode generation functions out of line Richard Henderson
                   ` (9 more replies)
  0 siblings, 10 replies; 21+ messages in thread
From: Richard Henderson @ 2014-11-11 16:24 UTC
  To: qemu-devel; +Cc: aurelien

Currently tcg ops are simply placed in a buffer in order, which is
fine until we want to actually do something with the opcode stream,
such as optimize it.  Note the horrible things like call opcodes
needing their argument count both prefixed and postfixed so that we
can iterate across the call either forward or backward.
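
As a rough illustration of why a list helps, here is a minimal sketch
of a doubly linked op list kept in an array and linked by index.  All
names (Op, OpList, oplist_*) are hypothetical and are not the
structures used in this series; the point is only that removal,
insertion, and walking in either direction need no per-opcode
bookkeeping.

```c
#include <assert.h>

/* Hypothetical sketch: none of these names come from the patch series. */
enum { MAX_OPS = 16 };

typedef struct Op {
    int opc;         /* opcode number */
    int prev, next;  /* indices of the neighbouring ops; 0 is the sentinel */
} Op;

typedef struct OpList {
    Op ops[MAX_OPS]; /* ops[0] is a sentinel: next = head, prev = tail */
    int used;        /* slots handed out so far, sentinel included */
} OpList;

static void oplist_init(OpList *l)
{
    l->ops[0].opc = -1;
    l->ops[0].prev = 0;
    l->ops[0].next = 0;
    l->used = 1;
}

/* Link a fresh op in front of the op at index 'at'; returns its index.
   Passing at == 0 (the sentinel) appends at the tail.  */
static int oplist_insert_before(OpList *l, int at, int opc)
{
    int i, p;

    assert(l->used < MAX_OPS);
    i = l->used++;
    p = l->ops[at].prev;

    l->ops[i].opc = opc;
    l->ops[i].prev = p;
    l->ops[i].next = at;
    l->ops[p].next = i;
    l->ops[at].prev = i;
    return i;
}

static int oplist_append(OpList *l, int opc)
{
    return oplist_insert_before(l, 0, opc);
}

/* Unlink an op; no nop placeholder is left behind.  */
static void oplist_remove(OpList *l, int i)
{
    int p = l->ops[i].prev;
    int n = l->ops[i].next;

    l->ops[p].next = n;
    l->ops[n].prev = p;
}

/* Walk the list in either direction and count the live ops.  */
static int oplist_count(const OpList *l, int forward)
{
    int n = 0;
    int i = forward ? l->ops[0].next : l->ops[0].prev;

    while (i != 0) {
        n++;
        i = forward ? l->ops[i].next : l->ops[i].prev;
    }
    return n;
}
```

With links in both directions, a backward walk needs nothing from the
op itself, which is what the doubled argument count on call opcodes
was working around.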

While I'm changing this, I also move quite a lot of tcg-op.h out of
line.  There is very little benefit to having most of them be inline,
since their arguments are extracted from the guest instructions being
translated, and thus their values are not really predictable.

I chose a cutoff of one function call.  If a tcg-op.h function consists
of a single function call, inline it, otherwise move it out of line.
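
To make the cutoff concrete, here is a toy sketch with hypothetical
names (emit_op, gen_mov, gen_addi); these are stand-ins, not the real
tcg-op.h functions.  The wrapper whose body is one call stays inline,
while the one with a branch and several calls is the kind that moves
to the .c file.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy emitter; these are not the real tcg-op.h functions. */
enum { OP_MOV, OP_MOVI, OP_ADD, LOG_SIZE = 8 };

static int op_log[LOG_SIZE];
static int op_count;

/* Out-of-line primitive that records one opcode.  */
static void emit_op(int opc)
{
    assert(op_count < LOG_SIZE);
    op_log[op_count++] = opc;
}

/* Body is a single call: under the cutoff, this stays inline.  */
static inline void gen_mov(void)
{
    emit_op(OP_MOV);
}

/* Body has a branch and several calls: this one moves out of line,
   i.e. it would be declared in the header and defined in the .c file.
   It is static here only to keep the sketch in one file.  */
static void gen_addi(int32_t imm)
{
    if (imm == 0) {
        gen_mov();
    } else {
        emit_op(OP_MOVI);
        emit_op(OP_ADD);
    }
}
```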

This also removes a bit of boilerplate from each target.

I haven't been able to measure a performance difference with this
patch set.  I wouldn't really expect any, as the complexity level
remains the same.  I simply find the linked list significantly more
maintainable.

Of course this isn't intended for the upcoming 2.2 release.

Comments?


r~


Richard Henderson (8):
  tcg: Move some opcode generation functions out of line
  tcg: Reduce ifdefs in tcg-op.c
  tcg: Move emit of INDEX_op_end into gen_tb_end
  tcg: Introduce tcg_op_buf_count and tcg_op_buf_full
  tcg: Put opcodes in a linked list
  tcg: Remove opcodes instead of noping them out
  tcg: Implement insert_op_before
  tcg: Remove unused opcodes

 Makefile.target               |    2 +-
 include/exec/gen-icount.h     |   22 +-
 target-alpha/translate.c      |   16 +-
 target-arm/translate-a64.c    |   10 +-
 target-arm/translate.c        |   10 +-
 target-cris/translate.c       |   15 +-
 target-i386/translate.c       |   11 +-
 target-lm32/translate.c       |   16 +-
 target-m68k/translate.c       |   10 +-
 target-microblaze/translate.c |   22 +-
 target-mips/translate.c       |   10 +-
 target-moxie/translate.c      |   10 +-
 target-openrisc/translate.c   |   15 +-
 target-ppc/translate.c        |   11 +-
 target-s390x/translate.c      |   11 +-
 target-sh4/translate.c        |   10 +-
 target-sparc/translate.c      |   10 +-
 target-tricore/translate.c    |    5 +-
 target-unicore32/translate.c  |   10 +-
 target-xtensa/translate.c     |    8 +-
 tcg/optimize.c                |  307 +++--
 tcg/tcg-op.c                  | 1941 ++++++++++++++++++++++++++++++++
 tcg/tcg-op.h                  | 2488 ++++++-----------------------------------
 tcg/tcg-opc.h                 |    9 -
 tcg/tcg.c                     |  535 +++------
 tcg/tcg.h                     |   72 +-
 tci.c                         |   13 -
 27 files changed, 2761 insertions(+), 2838 deletions(-)
 create mode 100644 tcg/tcg-op.c

-- 
1.9.3


* [Qemu-devel] [PATCH 2.3 1/8] tcg: Move some opcode generation functions out of line
  2014-11-11 16:24 [Qemu-devel] [PATCH 2.3 0/8] Linked list for tcg ops Richard Henderson
@ 2014-11-11 16:24 ` Richard Henderson
  2014-11-14 18:01   ` Bastian Koppelmann
  2014-11-11 16:24 ` [Qemu-devel] [PATCH 2.3 2/8] tcg: Reduce ifdefs in tcg-op.c Richard Henderson
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 21+ messages in thread
From: Richard Henderson @ 2014-11-11 16:24 UTC
  To: qemu-devel; +Cc: aurelien

Some of these functions are really quite large.  We have a number of
things that ought to be circularly dependent, but we duplicated code
to break that chain for the inlines.

This saved 25% of the code size of one of the translators I examined.
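
A minimal sketch of the kind of mutual dependence meant here, using
hypothetical stand-ins (gen_not, gen_xori, host_has_not) rather than
the real generators: "not" falls back to "xor with -1", while "xor
with -1" prefers a native "not".  With ordinary prototypes both can
refer to each other; the direct emit in gen_xori mirrors the "Don't
recurse" comments in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; not the real tcg generator functions.  */
enum { OP_NOT, OP_XOR_IMM };

static bool host_has_not;   /* plays the role of a TCG_TARGET_HAS_* flag */
static int last_op;         /* last opcode emitted */
static int32_t last_imm;    /* its immediate operand, if any */

/* Prototypes let the two generators refer to each other.  */
static void gen_not(void);
static void gen_xori(int32_t imm);

static void emit(int opc, int32_t imm)
{
    last_op = opc;
    last_imm = imm;
}

static void gen_not(void)
{
    if (host_has_not) {
        emit(OP_NOT, 0);
    } else {
        /* No native not: fall back to xor with all ones.  */
        gen_xori(-1);
    }
}

static void gen_xori(int32_t imm)
{
    if (imm == -1 && host_has_not) {
        /* Emit the op directly instead of bouncing back to gen_not.  */
        emit(OP_NOT, 0);
    } else {
        emit(OP_XOR_IMM, imm);
    }
}
```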

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 Makefile.target |    2 +-
 tcg/tcg-op.c    | 1978 +++++++++++++++++++++++++++++++++++++++++++
 tcg/tcg-op.h    | 2488 ++++++++-----------------------------------------------
 tcg/tcg.c       |  137 ---
 tcg/tcg.h       |    3 -
 5 files changed, 2339 insertions(+), 2269 deletions(-)
 create mode 100644 tcg/tcg-op.c

diff --git a/Makefile.target b/Makefile.target
index e9ff1ee..58c6ae1 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -83,7 +83,7 @@ all: $(PROGS) stap
 #########################################################
 # cpu emulator library
 obj-y = exec.o translate-all.o cpu-exec.o
-obj-y += tcg/tcg.o tcg/optimize.o
+obj-y += tcg/tcg.o tcg/tcg-op.o tcg/optimize.o
 obj-$(CONFIG_TCG_INTERPRETER) += tci.o
 obj-$(CONFIG_TCG_INTERPRETER) += disas/tci.o
 obj-y += fpu/softfloat.o
diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
new file mode 100644
index 0000000..a6fd0a6
--- /dev/null
+++ b/tcg/tcg-op.c
@@ -0,0 +1,1978 @@
+/*
+ * Tiny Code Generator for QEMU
+ *
+ * Copyright (c) 2008 Fabrice Bellard
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "tcg.h"
+#include "tcg-op.h"
+
+
+void tcg_gen_op0(TCGContext *ctx, TCGOpcode opc)
+{
+    *ctx->gen_opc_ptr++ = opc;
+}
+
+void tcg_gen_op1(TCGContext *ctx, TCGOpcode opc, TCGArg a1)
+{
+    uint16_t *op = ctx->gen_opc_ptr;
+    TCGArg *opp = ctx->gen_opparam_ptr;
+
+    op[0] = opc;
+    opp[0] = a1;
+
+    ctx->gen_opc_ptr = op + 1;
+    ctx->gen_opparam_ptr = opp + 1;
+}
+
+void tcg_gen_op2(TCGContext *ctx, TCGOpcode opc, TCGArg a1, TCGArg a2)
+{
+    uint16_t *op = ctx->gen_opc_ptr;
+    TCGArg *opp = ctx->gen_opparam_ptr;
+
+    op[0] = opc;
+    opp[0] = a1;
+    opp[1] = a2;
+
+    ctx->gen_opc_ptr = op + 1;
+    ctx->gen_opparam_ptr = opp + 2;
+}
+
+void tcg_gen_op3(TCGContext *ctx, TCGOpcode opc, TCGArg a1,
+                 TCGArg a2, TCGArg a3)
+{
+    uint16_t *op = ctx->gen_opc_ptr;
+    TCGArg *opp = ctx->gen_opparam_ptr;
+
+    op[0] = opc;
+    opp[0] = a1;
+    opp[1] = a2;
+    opp[2] = a3;
+
+    ctx->gen_opc_ptr = op + 1;
+    ctx->gen_opparam_ptr = opp + 3;
+}
+
+void tcg_gen_op4(TCGContext *ctx, TCGOpcode opc, TCGArg a1,
+                 TCGArg a2, TCGArg a3, TCGArg a4)
+{
+    uint16_t *op = ctx->gen_opc_ptr;
+    TCGArg *opp = ctx->gen_opparam_ptr;
+
+    op[0] = opc;
+    opp[0] = a1;
+    opp[1] = a2;
+    opp[2] = a3;
+    opp[3] = a4;
+
+    ctx->gen_opc_ptr = op + 1;
+    ctx->gen_opparam_ptr = opp + 4;
+}
+
+void tcg_gen_op5(TCGContext *ctx, TCGOpcode opc, TCGArg a1,
+                 TCGArg a2, TCGArg a3, TCGArg a4, TCGArg a5)
+{
+    uint16_t *op = ctx->gen_opc_ptr;
+    TCGArg *opp = ctx->gen_opparam_ptr;
+
+    op[0] = opc;
+    opp[0] = a1;
+    opp[1] = a2;
+    opp[2] = a3;
+    opp[3] = a4;
+    opp[4] = a5;
+
+    ctx->gen_opc_ptr = op + 1;
+    ctx->gen_opparam_ptr = opp + 5;
+}
+
+void tcg_gen_op6(TCGContext *ctx, TCGOpcode opc, TCGArg a1, TCGArg a2,
+                 TCGArg a3, TCGArg a4, TCGArg a5, TCGArg a6)
+{
+    uint16_t *op = ctx->gen_opc_ptr;
+    TCGArg *opp = ctx->gen_opparam_ptr;
+
+    op[0] = opc;
+    opp[0] = a1;
+    opp[1] = a2;
+    opp[2] = a3;
+    opp[3] = a4;
+    opp[4] = a5;
+    opp[5] = a6;
+
+    ctx->gen_opc_ptr = op + 1;
+    ctx->gen_opparam_ptr = opp + 6;
+}
+
+/* 32 bit ops */
+
+void tcg_gen_addi_i32(TCGv_i32 ret, TCGv_i32 arg1, int32_t arg2)
+{
+    /* some cases can be optimized here */
+    if (arg2 == 0) {
+        tcg_gen_mov_i32(ret, arg1);
+    } else {
+        TCGv_i32 t0 = tcg_const_i32(arg2);
+        tcg_gen_add_i32(ret, arg1, t0);
+        tcg_temp_free_i32(t0);
+    }
+}
+
+void tcg_gen_subfi_i32(TCGv_i32 ret, int32_t arg1, TCGv_i32 arg2)
+{
+    if (arg1 == 0 && TCG_TARGET_HAS_neg_i32) {
+        /* Don't recurse with tcg_gen_neg_i32.  */
+        tcg_gen_op2_i32(INDEX_op_neg_i32, ret, arg2);
+    } else {
+        TCGv_i32 t0 = tcg_const_i32(arg1);
+        tcg_gen_sub_i32(ret, t0, arg2);
+        tcg_temp_free_i32(t0);
+    }
+}
+
+void tcg_gen_subi_i32(TCGv_i32 ret, TCGv_i32 arg1, int32_t arg2)
+{
+    /* some cases can be optimized here */
+    if (arg2 == 0) {
+        tcg_gen_mov_i32(ret, arg1);
+    } else {
+        TCGv_i32 t0 = tcg_const_i32(arg2);
+        tcg_gen_sub_i32(ret, arg1, t0);
+        tcg_temp_free_i32(t0);
+    }
+}
+
+void tcg_gen_andi_i32(TCGv_i32 ret, TCGv_i32 arg1, uint32_t arg2)
+{
+    TCGv_i32 t0;
+    /* Some cases can be optimized here.  */
+    switch (arg2) {
+    case 0:
+        tcg_gen_movi_i32(ret, 0);
+        return;
+    case 0xffffffffu:
+        tcg_gen_mov_i32(ret, arg1);
+        return;
+    case 0xffu:
+        /* Don't recurse with tcg_gen_ext8u_i32.  */
+        if (TCG_TARGET_HAS_ext8u_i32) {
+            tcg_gen_op2_i32(INDEX_op_ext8u_i32, ret, arg1);
+            return;
+        }
+        break;
+    case 0xffffu:
+        if (TCG_TARGET_HAS_ext16u_i32) {
+            tcg_gen_op2_i32(INDEX_op_ext16u_i32, ret, arg1);
+            return;
+        }
+        break;
+    }
+    t0 = tcg_const_i32(arg2);
+    tcg_gen_and_i32(ret, arg1, t0);
+    tcg_temp_free_i32(t0);
+}
+
+void tcg_gen_ori_i32(TCGv_i32 ret, TCGv_i32 arg1, int32_t arg2)
+{
+    /* Some cases can be optimized here.  */
+    if (arg2 == -1) {
+        tcg_gen_movi_i32(ret, -1);
+    } else if (arg2 == 0) {
+        tcg_gen_mov_i32(ret, arg1);
+    } else {
+        TCGv_i32 t0 = tcg_const_i32(arg2);
+        tcg_gen_or_i32(ret, arg1, t0);
+        tcg_temp_free_i32(t0);
+    }
+}
+
+void tcg_gen_xori_i32(TCGv_i32 ret, TCGv_i32 arg1, int32_t arg2)
+{
+    /* Some cases can be optimized here.  */
+    if (arg2 == 0) {
+        tcg_gen_mov_i32(ret, arg1);
+    } else if (arg2 == -1 && TCG_TARGET_HAS_not_i32) {
+        /* Don't recurse with tcg_gen_not_i32.  */
+        tcg_gen_op2_i32(INDEX_op_not_i32, ret, arg1);
+    } else {
+        TCGv_i32 t0 = tcg_const_i32(arg2);
+        tcg_gen_xor_i32(ret, arg1, t0);
+        tcg_temp_free_i32(t0);
+    }
+}
+
+void tcg_gen_shli_i32(TCGv_i32 ret, TCGv_i32 arg1, unsigned arg2)
+{
+    tcg_debug_assert(arg2 < 32);
+    if (arg2 == 0) {
+        tcg_gen_mov_i32(ret, arg1);
+    } else {
+        TCGv_i32 t0 = tcg_const_i32(arg2);
+        tcg_gen_shl_i32(ret, arg1, t0);
+        tcg_temp_free_i32(t0);
+    }
+}
+
+void tcg_gen_shri_i32(TCGv_i32 ret, TCGv_i32 arg1, unsigned arg2)
+{
+    tcg_debug_assert(arg2 < 32);
+    if (arg2 == 0) {
+        tcg_gen_mov_i32(ret, arg1);
+    } else {
+        TCGv_i32 t0 = tcg_const_i32(arg2);
+        tcg_gen_shr_i32(ret, arg1, t0);
+        tcg_temp_free_i32(t0);
+    }
+}
+
+void tcg_gen_sari_i32(TCGv_i32 ret, TCGv_i32 arg1, unsigned arg2)
+{
+    tcg_debug_assert(arg2 < 32);
+    if (arg2 == 0) {
+        tcg_gen_mov_i32(ret, arg1);
+    } else {
+        TCGv_i32 t0 = tcg_const_i32(arg2);
+        tcg_gen_sar_i32(ret, arg1, t0);
+        tcg_temp_free_i32(t0);
+    }
+}
+
+void tcg_gen_brcond_i32(TCGCond cond, TCGv_i32 arg1, TCGv_i32 arg2, int label)
+{
+    if (cond == TCG_COND_ALWAYS) {
+        tcg_gen_br(label);
+    } else if (cond != TCG_COND_NEVER) {
+        tcg_gen_op4ii_i32(INDEX_op_brcond_i32, arg1, arg2, cond, label);
+    }
+}
+
+void tcg_gen_brcondi_i32(TCGCond cond, TCGv_i32 arg1, int32_t arg2, int label)
+{
+    TCGv_i32 t0 = tcg_const_i32(arg2);
+    tcg_gen_brcond_i32(cond, arg1, t0, label);
+    tcg_temp_free_i32(t0);
+}
+
+void tcg_gen_setcond_i32(TCGCond cond, TCGv_i32 ret,
+                         TCGv_i32 arg1, TCGv_i32 arg2)
+{
+    if (cond == TCG_COND_ALWAYS) {
+        tcg_gen_movi_i32(ret, 1);
+    } else if (cond == TCG_COND_NEVER) {
+        tcg_gen_movi_i32(ret, 0);
+    } else {
+        tcg_gen_op4i_i32(INDEX_op_setcond_i32, ret, arg1, arg2, cond);
+    }
+}
+
+void tcg_gen_setcondi_i32(TCGCond cond, TCGv_i32 ret,
+                          TCGv_i32 arg1, int32_t arg2)
+{
+    TCGv_i32 t0 = tcg_const_i32(arg2);
+    tcg_gen_setcond_i32(cond, ret, arg1, t0);
+    tcg_temp_free_i32(t0);
+}
+
+void tcg_gen_muli_i32(TCGv_i32 ret, TCGv_i32 arg1, int32_t arg2)
+{
+    TCGv_i32 t0 = tcg_const_i32(arg2);
+    tcg_gen_mul_i32(ret, arg1, t0);
+    tcg_temp_free_i32(t0);
+}
+
+void tcg_gen_div_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
+{
+    if (TCG_TARGET_HAS_div_i32) {
+        tcg_gen_op3_i32(INDEX_op_div_i32, ret, arg1, arg2);
+    } else if (TCG_TARGET_HAS_div2_i32) {
+        TCGv_i32 t0 = tcg_temp_new_i32();
+        tcg_gen_sari_i32(t0, arg1, 31);
+        tcg_gen_op5_i32(INDEX_op_div2_i32, ret, t0, arg1, t0, arg2);
+        tcg_temp_free_i32(t0);
+    } else {
+        gen_helper_div_i32(ret, arg1, arg2);
+    }
+}
+
+void tcg_gen_rem_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
+{
+    if (TCG_TARGET_HAS_rem_i32) {
+        tcg_gen_op3_i32(INDEX_op_rem_i32, ret, arg1, arg2);
+    } else if (TCG_TARGET_HAS_div_i32) {
+        TCGv_i32 t0 = tcg_temp_new_i32();
+        tcg_gen_op3_i32(INDEX_op_div_i32, t0, arg1, arg2);
+        tcg_gen_mul_i32(t0, t0, arg2);
+        tcg_gen_sub_i32(ret, arg1, t0);
+        tcg_temp_free_i32(t0);
+    } else if (TCG_TARGET_HAS_div2_i32) {
+        TCGv_i32 t0 = tcg_temp_new_i32();
+        tcg_gen_sari_i32(t0, arg1, 31);
+        tcg_gen_op5_i32(INDEX_op_div2_i32, t0, ret, arg1, t0, arg2);
+        tcg_temp_free_i32(t0);
+    } else {
+        gen_helper_rem_i32(ret, arg1, arg2);
+    }
+}
+
+void tcg_gen_divu_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
+{
+    if (TCG_TARGET_HAS_div_i32) {
+        tcg_gen_op3_i32(INDEX_op_divu_i32, ret, arg1, arg2);
+    } else if (TCG_TARGET_HAS_div2_i32) {
+        TCGv_i32 t0 = tcg_temp_new_i32();
+        tcg_gen_movi_i32(t0, 0);
+        tcg_gen_op5_i32(INDEX_op_divu2_i32, ret, t0, arg1, t0, arg2);
+        tcg_temp_free_i32(t0);
+    } else {
+        gen_helper_divu_i32(ret, arg1, arg2);
+    }
+}
+
+void tcg_gen_remu_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
+{
+    if (TCG_TARGET_HAS_rem_i32) {
+        tcg_gen_op3_i32(INDEX_op_remu_i32, ret, arg1, arg2);
+    } else if (TCG_TARGET_HAS_div_i32) {
+        TCGv_i32 t0 = tcg_temp_new_i32();
+        tcg_gen_op3_i32(INDEX_op_divu_i32, t0, arg1, arg2);
+        tcg_gen_mul_i32(t0, t0, arg2);
+        tcg_gen_sub_i32(ret, arg1, t0);
+        tcg_temp_free_i32(t0);
+    } else if (TCG_TARGET_HAS_div2_i32) {
+        TCGv_i32 t0 = tcg_temp_new_i32();
+        tcg_gen_movi_i32(t0, 0);
+        tcg_gen_op5_i32(INDEX_op_divu2_i32, t0, ret, arg1, t0, arg2);
+        tcg_temp_free_i32(t0);
+    } else {
+        gen_helper_remu_i32(ret, arg1, arg2);
+    }
+}
+
+void tcg_gen_andc_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
+{
+    if (TCG_TARGET_HAS_andc_i32) {
+        tcg_gen_op3_i32(INDEX_op_andc_i32, ret, arg1, arg2);
+    } else {
+        TCGv_i32 t0 = tcg_temp_new_i32();
+        tcg_gen_not_i32(t0, arg2);
+        tcg_gen_and_i32(ret, arg1, t0);
+        tcg_temp_free_i32(t0);
+    }
+}
+
+void tcg_gen_eqv_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
+{
+    if (TCG_TARGET_HAS_eqv_i32) {
+        tcg_gen_op3_i32(INDEX_op_eqv_i32, ret, arg1, arg2);
+    } else {
+        tcg_gen_xor_i32(ret, arg1, arg2);
+        tcg_gen_not_i32(ret, ret);
+    }
+}
+
+void tcg_gen_nand_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
+{
+    if (TCG_TARGET_HAS_nand_i32) {
+        tcg_gen_op3_i32(INDEX_op_nand_i32, ret, arg1, arg2);
+    } else {
+        tcg_gen_and_i32(ret, arg1, arg2);
+        tcg_gen_not_i32(ret, ret);
+    }
+}
+
+void tcg_gen_nor_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
+{
+    if (TCG_TARGET_HAS_nor_i32) {
+        tcg_gen_op3_i32(INDEX_op_nor_i32, ret, arg1, arg2);
+    } else {
+        tcg_gen_or_i32(ret, arg1, arg2);
+        tcg_gen_not_i32(ret, ret);
+    }
+}
+
+void tcg_gen_orc_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
+{
+    if (TCG_TARGET_HAS_orc_i32) {
+        tcg_gen_op3_i32(INDEX_op_orc_i32, ret, arg1, arg2);
+    } else {
+        TCGv_i32 t0 = tcg_temp_new_i32();
+        tcg_gen_not_i32(t0, arg2);
+        tcg_gen_or_i32(ret, arg1, t0);
+        tcg_temp_free_i32(t0);
+    }
+}
+
+void tcg_gen_rotl_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
+{
+    if (TCG_TARGET_HAS_rot_i32) {
+        tcg_gen_op3_i32(INDEX_op_rotl_i32, ret, arg1, arg2);
+    } else {
+        TCGv_i32 t0, t1;
+
+        t0 = tcg_temp_new_i32();
+        t1 = tcg_temp_new_i32();
+        tcg_gen_shl_i32(t0, arg1, arg2);
+        tcg_gen_subfi_i32(t1, 32, arg2);
+        tcg_gen_shr_i32(t1, arg1, t1);
+        tcg_gen_or_i32(ret, t0, t1);
+        tcg_temp_free_i32(t0);
+        tcg_temp_free_i32(t1);
+    }
+}
+
+void tcg_gen_rotli_i32(TCGv_i32 ret, TCGv_i32 arg1, unsigned arg2)
+{
+    tcg_debug_assert(arg2 < 32);
+    /* some cases can be optimized here */
+    if (arg2 == 0) {
+        tcg_gen_mov_i32(ret, arg1);
+    } else if (TCG_TARGET_HAS_rot_i32) {
+        TCGv_i32 t0 = tcg_const_i32(arg2);
+        tcg_gen_rotl_i32(ret, arg1, t0);
+        tcg_temp_free_i32(t0);
+    } else {
+        TCGv_i32 t0, t1;
+        t0 = tcg_temp_new_i32();
+        t1 = tcg_temp_new_i32();
+        tcg_gen_shli_i32(t0, arg1, arg2);
+        tcg_gen_shri_i32(t1, arg1, 32 - arg2);
+        tcg_gen_or_i32(ret, t0, t1);
+        tcg_temp_free_i32(t0);
+        tcg_temp_free_i32(t1);
+    }
+}
+
+void tcg_gen_rotr_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
+{
+    if (TCG_TARGET_HAS_rot_i32) {
+        tcg_gen_op3_i32(INDEX_op_rotr_i32, ret, arg1, arg2);
+    } else {
+        TCGv_i32 t0, t1;
+
+        t0 = tcg_temp_new_i32();
+        t1 = tcg_temp_new_i32();
+        tcg_gen_shr_i32(t0, arg1, arg2);
+        tcg_gen_subfi_i32(t1, 32, arg2);
+        tcg_gen_shl_i32(t1, arg1, t1);
+        tcg_gen_or_i32(ret, t0, t1);
+        tcg_temp_free_i32(t0);
+        tcg_temp_free_i32(t1);
+    }
+}
+
+void tcg_gen_rotri_i32(TCGv_i32 ret, TCGv_i32 arg1, unsigned arg2)
+{
+    tcg_debug_assert(arg2 < 32);
+    /* some cases can be optimized here */
+    if (arg2 == 0) {
+        tcg_gen_mov_i32(ret, arg1);
+    } else {
+        tcg_gen_rotli_i32(ret, arg1, 32 - arg2);
+    }
+}
+
+void tcg_gen_deposit_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2,
+                         unsigned int ofs, unsigned int len)
+{
+    uint32_t mask;
+    TCGv_i32 t1;
+
+    tcg_debug_assert(ofs < 32);
+    tcg_debug_assert(len <= 32);
+    tcg_debug_assert(ofs + len <= 32);
+
+    if (ofs == 0 && len == 32) {
+        tcg_gen_mov_i32(ret, arg2);
+        return;
+    }
+    if (TCG_TARGET_HAS_deposit_i32 && TCG_TARGET_deposit_i32_valid(ofs, len)) {
+        tcg_gen_op5ii_i32(INDEX_op_deposit_i32, ret, arg1, arg2, ofs, len);
+        return;
+    }
+
+    mask = (1u << len) - 1;
+    t1 = tcg_temp_new_i32();
+
+    if (ofs + len < 32) {
+        tcg_gen_andi_i32(t1, arg2, mask);
+        tcg_gen_shli_i32(t1, t1, ofs);
+    } else {
+        tcg_gen_shli_i32(t1, arg2, ofs);
+    }
+    tcg_gen_andi_i32(ret, arg1, ~(mask << ofs));
+    tcg_gen_or_i32(ret, ret, t1);
+
+    tcg_temp_free_i32(t1);
+}
+
+void tcg_gen_movcond_i32(TCGCond cond, TCGv_i32 ret, TCGv_i32 c1,
+                         TCGv_i32 c2, TCGv_i32 v1, TCGv_i32 v2)
+{
+    if (TCG_TARGET_HAS_movcond_i32) {
+        tcg_gen_op6i_i32(INDEX_op_movcond_i32, ret, c1, c2, v1, v2, cond);
+    } else {
+        TCGv_i32 t0 = tcg_temp_new_i32();
+        TCGv_i32 t1 = tcg_temp_new_i32();
+        tcg_gen_setcond_i32(cond, t0, c1, c2);
+        tcg_gen_neg_i32(t0, t0);
+        tcg_gen_and_i32(t1, v1, t0);
+        tcg_gen_andc_i32(ret, v2, t0);
+        tcg_gen_or_i32(ret, ret, t1);
+        tcg_temp_free_i32(t0);
+        tcg_temp_free_i32(t1);
+    }
+}
+
+void tcg_gen_add2_i32(TCGv_i32 rl, TCGv_i32 rh, TCGv_i32 al,
+                      TCGv_i32 ah, TCGv_i32 bl, TCGv_i32 bh)
+{
+    if (TCG_TARGET_HAS_add2_i32) {
+        tcg_gen_op6_i32(INDEX_op_add2_i32, rl, rh, al, ah, bl, bh);
+        /* Allow the optimizer room to replace add2 with two moves.  */
+        tcg_gen_op0(&tcg_ctx, INDEX_op_nop);
+    } else {
+        TCGv_i64 t0 = tcg_temp_new_i64();
+        TCGv_i64 t1 = tcg_temp_new_i64();
+        tcg_gen_concat_i32_i64(t0, al, ah);
+        tcg_gen_concat_i32_i64(t1, bl, bh);
+        tcg_gen_add_i64(t0, t0, t1);
+        tcg_gen_extr_i64_i32(rl, rh, t0);
+        tcg_temp_free_i64(t0);
+        tcg_temp_free_i64(t1);
+    }
+}
+
+void tcg_gen_sub2_i32(TCGv_i32 rl, TCGv_i32 rh, TCGv_i32 al,
+                      TCGv_i32 ah, TCGv_i32 bl, TCGv_i32 bh)
+{
+    if (TCG_TARGET_HAS_sub2_i32) {
+        tcg_gen_op6_i32(INDEX_op_sub2_i32, rl, rh, al, ah, bl, bh);
+        /* Allow the optimizer room to replace sub2 with two moves.  */
+        tcg_gen_op0(&tcg_ctx, INDEX_op_nop);
+    } else {
+        TCGv_i64 t0 = tcg_temp_new_i64();
+        TCGv_i64 t1 = tcg_temp_new_i64();
+        tcg_gen_concat_i32_i64(t0, al, ah);
+        tcg_gen_concat_i32_i64(t1, bl, bh);
+        tcg_gen_sub_i64(t0, t0, t1);
+        tcg_gen_extr_i64_i32(rl, rh, t0);
+        tcg_temp_free_i64(t0);
+        tcg_temp_free_i64(t1);
+    }
+}
+
+void tcg_gen_mulu2_i32(TCGv_i32 rl, TCGv_i32 rh, TCGv_i32 arg1, TCGv_i32 arg2)
+{
+    if (TCG_TARGET_HAS_mulu2_i32) {
+        tcg_gen_op4_i32(INDEX_op_mulu2_i32, rl, rh, arg1, arg2);
+        /* Allow the optimizer room to replace mulu2 with two moves.  */
+        tcg_gen_op0(&tcg_ctx, INDEX_op_nop);
+    } else if (TCG_TARGET_HAS_muluh_i32) {
+        TCGv_i32 t = tcg_temp_new_i32();
+        tcg_gen_op3_i32(INDEX_op_mul_i32, t, arg1, arg2);
+        tcg_gen_op3_i32(INDEX_op_muluh_i32, rh, arg1, arg2);
+        tcg_gen_mov_i32(rl, t);
+        tcg_temp_free_i32(t);
+    } else {
+        TCGv_i64 t0 = tcg_temp_new_i64();
+        TCGv_i64 t1 = tcg_temp_new_i64();
+        tcg_gen_extu_i32_i64(t0, arg1);
+        tcg_gen_extu_i32_i64(t1, arg2);
+        tcg_gen_mul_i64(t0, t0, t1);
+        tcg_gen_extr_i64_i32(rl, rh, t0);
+        tcg_temp_free_i64(t0);
+        tcg_temp_free_i64(t1);
+    }
+}
+
+void tcg_gen_muls2_i32(TCGv_i32 rl, TCGv_i32 rh, TCGv_i32 arg1, TCGv_i32 arg2)
+{
+    if (TCG_TARGET_HAS_muls2_i32) {
+        tcg_gen_op4_i32(INDEX_op_muls2_i32, rl, rh, arg1, arg2);
+        /* Allow the optimizer room to replace muls2 with two moves.  */
+        tcg_gen_op0(&tcg_ctx, INDEX_op_nop);
+    } else if (TCG_TARGET_HAS_mulsh_i32) {
+        TCGv_i32 t = tcg_temp_new_i32();
+        tcg_gen_op3_i32(INDEX_op_mul_i32, t, arg1, arg2);
+        tcg_gen_op3_i32(INDEX_op_mulsh_i32, rh, arg1, arg2);
+        tcg_gen_mov_i32(rl, t);
+        tcg_temp_free_i32(t);
+    } else if (TCG_TARGET_REG_BITS == 32) {
+        TCGv_i32 t0 = tcg_temp_new_i32();
+        TCGv_i32 t1 = tcg_temp_new_i32();
+        TCGv_i32 t2 = tcg_temp_new_i32();
+        TCGv_i32 t3 = tcg_temp_new_i32();
+        tcg_gen_mulu2_i32(t0, t1, arg1, arg2);
+        /* Adjust for negative inputs.  */
+        tcg_gen_sari_i32(t2, arg1, 31);
+        tcg_gen_sari_i32(t3, arg2, 31);
+        tcg_gen_and_i32(t2, t2, arg2);
+        tcg_gen_and_i32(t3, t3, arg1);
+        tcg_gen_sub_i32(rh, t1, t2);
+        tcg_gen_sub_i32(rh, rh, t3);
+        tcg_gen_mov_i32(rl, t0);
+        tcg_temp_free_i32(t0);
+        tcg_temp_free_i32(t1);
+        tcg_temp_free_i32(t2);
+        tcg_temp_free_i32(t3);
+    } else {
+        TCGv_i64 t0 = tcg_temp_new_i64();
+        TCGv_i64 t1 = tcg_temp_new_i64();
+        tcg_gen_ext_i32_i64(t0, arg1);
+        tcg_gen_ext_i32_i64(t1, arg2);
+        tcg_gen_mul_i64(t0, t0, t1);
+        tcg_gen_extr_i64_i32(rl, rh, t0);
+        tcg_temp_free_i64(t0);
+        tcg_temp_free_i64(t1);
+    }
+}
+
+void tcg_gen_ext8s_i32(TCGv_i32 ret, TCGv_i32 arg)
+{
+    if (TCG_TARGET_HAS_ext8s_i32) {
+        tcg_gen_op2_i32(INDEX_op_ext8s_i32, ret, arg);
+    } else {
+        tcg_gen_shli_i32(ret, arg, 24);
+        tcg_gen_sari_i32(ret, ret, 24);
+    }
+}
+
+void tcg_gen_ext16s_i32(TCGv_i32 ret, TCGv_i32 arg)
+{
+    if (TCG_TARGET_HAS_ext16s_i32) {
+        tcg_gen_op2_i32(INDEX_op_ext16s_i32, ret, arg);
+    } else {
+        tcg_gen_shli_i32(ret, arg, 16);
+        tcg_gen_sari_i32(ret, ret, 16);
+    }
+}
+
+void tcg_gen_ext8u_i32(TCGv_i32 ret, TCGv_i32 arg)
+{
+    if (TCG_TARGET_HAS_ext8u_i32) {
+        tcg_gen_op2_i32(INDEX_op_ext8u_i32, ret, arg);
+    } else {
+        tcg_gen_andi_i32(ret, arg, 0xffu);
+    }
+}
+
+void tcg_gen_ext16u_i32(TCGv_i32 ret, TCGv_i32 arg)
+{
+    if (TCG_TARGET_HAS_ext16u_i32) {
+        tcg_gen_op2_i32(INDEX_op_ext16u_i32, ret, arg);
+    } else {
+        tcg_gen_andi_i32(ret, arg, 0xffffu);
+    }
+}
+
+/* Note: we assume the two high bytes are set to zero */
+void tcg_gen_bswap16_i32(TCGv_i32 ret, TCGv_i32 arg)
+{
+    if (TCG_TARGET_HAS_bswap16_i32) {
+        tcg_gen_op2_i32(INDEX_op_bswap16_i32, ret, arg);
+    } else {
+        TCGv_i32 t0 = tcg_temp_new_i32();
+
+        tcg_gen_ext8u_i32(t0, arg);
+        tcg_gen_shli_i32(t0, t0, 8);
+        tcg_gen_shri_i32(ret, arg, 8);
+        tcg_gen_or_i32(ret, ret, t0);
+        tcg_temp_free_i32(t0);
+    }
+}
+
+void tcg_gen_bswap32_i32(TCGv_i32 ret, TCGv_i32 arg)
+{
+    if (TCG_TARGET_HAS_bswap32_i32) {
+        tcg_gen_op2_i32(INDEX_op_bswap32_i32, ret, arg);
+    } else {
+        TCGv_i32 t0, t1;
+        t0 = tcg_temp_new_i32();
+        t1 = tcg_temp_new_i32();
+
+        tcg_gen_shli_i32(t0, arg, 24);
+
+        tcg_gen_andi_i32(t1, arg, 0x0000ff00);
+        tcg_gen_shli_i32(t1, t1, 8);
+        tcg_gen_or_i32(t0, t0, t1);
+
+        tcg_gen_shri_i32(t1, arg, 8);
+        tcg_gen_andi_i32(t1, t1, 0x0000ff00);
+        tcg_gen_or_i32(t0, t0, t1);
+
+        tcg_gen_shri_i32(t1, arg, 24);
+        tcg_gen_or_i32(ret, t0, t1);
+        tcg_temp_free_i32(t0);
+        tcg_temp_free_i32(t1);
+    }
+}
+
+/* 64-bit ops */
+
+#if TCG_TARGET_REG_BITS == 32
+/* These are all inline for TCG_TARGET_REG_BITS == 64.  */
+
+void tcg_gen_discard_i64(TCGv_i64 arg)
+{
+    tcg_gen_discard_i32(TCGV_LOW(arg));
+    tcg_gen_discard_i32(TCGV_HIGH(arg));
+}
+
+void tcg_gen_mov_i64(TCGv_i64 ret, TCGv_i64 arg)
+{
+    tcg_gen_mov_i32(TCGV_LOW(ret), TCGV_LOW(arg));
+    tcg_gen_mov_i32(TCGV_HIGH(ret), TCGV_HIGH(arg));
+}
+
+void tcg_gen_movi_i64(TCGv_i64 ret, int64_t arg)
+{
+    tcg_gen_movi_i32(TCGV_LOW(ret), arg);
+    tcg_gen_movi_i32(TCGV_HIGH(ret), arg >> 32);
+}
+
+void tcg_gen_ld8u_i64(TCGv_i64 ret, TCGv_ptr arg2, tcg_target_long offset)
+{
+    tcg_gen_ld8u_i32(TCGV_LOW(ret), arg2, offset);
+    tcg_gen_movi_i32(TCGV_HIGH(ret), 0);
+}
+
+void tcg_gen_ld8s_i64(TCGv_i64 ret, TCGv_ptr arg2, tcg_target_long offset)
+{
+    tcg_gen_ld8s_i32(TCGV_LOW(ret), arg2, offset);
+    tcg_gen_sari_i32(TCGV_HIGH(ret), TCGV_LOW(ret), 31);
+}
+
+void tcg_gen_ld16u_i64(TCGv_i64 ret, TCGv_ptr arg2, tcg_target_long offset)
+{
+    tcg_gen_ld16u_i32(TCGV_LOW(ret), arg2, offset);
+    tcg_gen_movi_i32(TCGV_HIGH(ret), 0);
+}
+
+void tcg_gen_ld16s_i64(TCGv_i64 ret, TCGv_ptr arg2, tcg_target_long offset)
+{
+    tcg_gen_ld16s_i32(TCGV_LOW(ret), arg2, offset);
+    tcg_gen_sari_i32(TCGV_HIGH(ret), TCGV_LOW(ret), 31);
+}
+
+void tcg_gen_ld32u_i64(TCGv_i64 ret, TCGv_ptr arg2, tcg_target_long offset)
+{
+    tcg_gen_ld_i32(TCGV_LOW(ret), arg2, offset);
+    tcg_gen_movi_i32(TCGV_HIGH(ret), 0);
+}
+
+void tcg_gen_ld32s_i64(TCGv_i64 ret, TCGv_ptr arg2, tcg_target_long offset)
+{
+    tcg_gen_ld_i32(TCGV_LOW(ret), arg2, offset);
+    tcg_gen_sari_i32(TCGV_HIGH(ret), TCGV_LOW(ret), 31);
+}
+
+void tcg_gen_ld_i64(TCGv_i64 ret, TCGv_ptr arg2, tcg_target_long offset)
+{
+    /* Since arg2 and ret have different types,
+       they cannot be the same temporary */
+#ifdef TCG_TARGET_WORDS_BIGENDIAN
+    tcg_gen_ld_i32(TCGV_HIGH(ret), arg2, offset);
+    tcg_gen_ld_i32(TCGV_LOW(ret), arg2, offset + 4);
+#else
+    tcg_gen_ld_i32(TCGV_LOW(ret), arg2, offset);
+    tcg_gen_ld_i32(TCGV_HIGH(ret), arg2, offset + 4);
+#endif
+}
+
+void tcg_gen_st_i64(TCGv_i64 arg1, TCGv_ptr arg2, tcg_target_long offset)
+{
+#ifdef TCG_TARGET_WORDS_BIGENDIAN
+    tcg_gen_st_i32(TCGV_HIGH(arg1), arg2, offset);
+    tcg_gen_st_i32(TCGV_LOW(arg1), arg2, offset + 4);
+#else
+    tcg_gen_st_i32(TCGV_LOW(arg1), arg2, offset);
+    tcg_gen_st_i32(TCGV_HIGH(arg1), arg2, offset + 4);
+#endif
+}
+
+void tcg_gen_and_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
+{
+    tcg_gen_and_i32(TCGV_LOW(ret), TCGV_LOW(arg1), TCGV_LOW(arg2));
+    tcg_gen_and_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1), TCGV_HIGH(arg2));
+}
+
+void tcg_gen_or_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
+{
+    tcg_gen_or_i32(TCGV_LOW(ret), TCGV_LOW(arg1), TCGV_LOW(arg2));
+    tcg_gen_or_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1), TCGV_HIGH(arg2));
+}
+
+void tcg_gen_xor_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
+{
+    tcg_gen_xor_i32(TCGV_LOW(ret), TCGV_LOW(arg1), TCGV_LOW(arg2));
+    tcg_gen_xor_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1), TCGV_HIGH(arg2));
+}
+
+void tcg_gen_shl_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
+{
+    gen_helper_shl_i64(ret, arg1, arg2);
+}
+
+void tcg_gen_shr_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
+{
+    gen_helper_shr_i64(ret, arg1, arg2);
+}
+
+void tcg_gen_sar_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
+{
+    gen_helper_sar_i64(ret, arg1, arg2);
+}
+
+void tcg_gen_mul_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
+{
+    TCGv_i64 t0;
+    TCGv_i32 t1;
+
+    t0 = tcg_temp_new_i64();
+    t1 = tcg_temp_new_i32();
+
+    tcg_gen_mulu2_i32(TCGV_LOW(t0), TCGV_HIGH(t0),
+                      TCGV_LOW(arg1), TCGV_LOW(arg2));
+
+    tcg_gen_mul_i32(t1, TCGV_LOW(arg1), TCGV_HIGH(arg2));
+    tcg_gen_add_i32(TCGV_HIGH(t0), TCGV_HIGH(t0), t1);
+    tcg_gen_mul_i32(t1, TCGV_HIGH(arg1), TCGV_LOW(arg2));
+    tcg_gen_add_i32(TCGV_HIGH(t0), TCGV_HIGH(t0), t1);
+
+    tcg_gen_mov_i64(ret, t0);
+    tcg_temp_free_i64(t0);
+    tcg_temp_free_i32(t1);
+}
+#endif /* TCG_TARGET_REG_BITS == 32 */
+
+void tcg_gen_addi_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
+{
+    /* some cases can be optimized here */
+    if (arg2 == 0) {
+        tcg_gen_mov_i64(ret, arg1);
+    } else {
+        TCGv_i64 t0 = tcg_const_i64(arg2);
+        tcg_gen_add_i64(ret, arg1, t0);
+        tcg_temp_free_i64(t0);
+    }
+}
+
+void tcg_gen_subfi_i64(TCGv_i64 ret, int64_t arg1, TCGv_i64 arg2)
+{
+    if (arg1 == 0 && TCG_TARGET_HAS_neg_i64) {
+        /* Don't recurse with tcg_gen_neg_i64.  */
+        tcg_gen_op2_i64(INDEX_op_neg_i64, ret, arg2);
+    } else {
+        TCGv_i64 t0 = tcg_const_i64(arg1);
+        tcg_gen_sub_i64(ret, t0, arg2);
+        tcg_temp_free_i64(t0);
+    }
+}
+
+void tcg_gen_subi_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
+{
+    /* some cases can be optimized here */
+    if (arg2 == 0) {
+        tcg_gen_mov_i64(ret, arg1);
+    } else {
+        TCGv_i64 t0 = tcg_const_i64(arg2);
+        tcg_gen_sub_i64(ret, arg1, t0);
+        tcg_temp_free_i64(t0);
+    }
+}
+
+void tcg_gen_andi_i64(TCGv_i64 ret, TCGv_i64 arg1, uint64_t arg2)
+{
+#if TCG_TARGET_REG_BITS == 32
+    tcg_gen_andi_i32(TCGV_LOW(ret), TCGV_LOW(arg1), arg2);
+    tcg_gen_andi_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1), arg2 >> 32);
+#else
+    TCGv_i64 t0;
+    /* Some cases can be optimized here.  */
+    switch (arg2) {
+    case 0:
+        tcg_gen_movi_i64(ret, 0);
+        return;
+    case 0xffffffffffffffffull:
+        tcg_gen_mov_i64(ret, arg1);
+        return;
+    case 0xffull:
+        /* Don't recurse with tcg_gen_ext8u_i64.  */
+        if (TCG_TARGET_HAS_ext8u_i64) {
+            tcg_gen_op2_i64(INDEX_op_ext8u_i64, ret, arg1);
+            return;
+        }
+        break;
+    case 0xffffu:
+        if (TCG_TARGET_HAS_ext16u_i64) {
+            tcg_gen_op2_i64(INDEX_op_ext16u_i64, ret, arg1);
+            return;
+        }
+        break;
+    case 0xffffffffull:
+        if (TCG_TARGET_HAS_ext32u_i64) {
+            tcg_gen_op2_i64(INDEX_op_ext32u_i64, ret, arg1);
+            return;
+        }
+        break;
+    }
+    t0 = tcg_const_i64(arg2);
+    tcg_gen_and_i64(ret, arg1, t0);
+    tcg_temp_free_i64(t0);
+#endif
+}
+
+void tcg_gen_ori_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
+{
+#if TCG_TARGET_REG_BITS == 32
+    tcg_gen_ori_i32(TCGV_LOW(ret), TCGV_LOW(arg1), arg2);
+    tcg_gen_ori_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1), arg2 >> 32);
+#else
+    /* Some cases can be optimized here.  */
+    if (arg2 == -1) {
+        tcg_gen_movi_i64(ret, -1);
+    } else if (arg2 == 0) {
+        tcg_gen_mov_i64(ret, arg1);
+    } else {
+        TCGv_i64 t0 = tcg_const_i64(arg2);
+        tcg_gen_or_i64(ret, arg1, t0);
+        tcg_temp_free_i64(t0);
+    }
+#endif
+}
+
+void tcg_gen_xori_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
+{
+#if TCG_TARGET_REG_BITS == 32
+    tcg_gen_xori_i32(TCGV_LOW(ret), TCGV_LOW(arg1), arg2);
+    tcg_gen_xori_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1), arg2 >> 32);
+#else
+    /* Some cases can be optimized here.  */
+    if (arg2 == 0) {
+        tcg_gen_mov_i64(ret, arg1);
+    } else if (arg2 == -1 && TCG_TARGET_HAS_not_i64) {
+        /* Don't recurse with tcg_gen_not_i64.  */
+        tcg_gen_op2_i64(INDEX_op_not_i64, ret, arg1);
+    } else {
+        TCGv_i64 t0 = tcg_const_i64(arg2);
+        tcg_gen_xor_i64(ret, arg1, t0);
+        tcg_temp_free_i64(t0);
+    }
+#endif
+}
+
+#if TCG_TARGET_REG_BITS == 32
+static inline void tcg_gen_shifti_i64(TCGv_i64 ret, TCGv_i64 arg1,
+                                      unsigned c, bool right, bool arith)
+{
+    tcg_debug_assert(c < 64);
+    if (c == 0) {
+        tcg_gen_mov_i32(TCGV_LOW(ret), TCGV_LOW(arg1));
+        tcg_gen_mov_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1));
+    } else if (c >= 32) {
+        c -= 32;
+        if (right) {
+            if (arith) {
+                tcg_gen_sari_i32(TCGV_LOW(ret), TCGV_HIGH(arg1), c);
+                tcg_gen_sari_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1), 31);
+            } else {
+                tcg_gen_shri_i32(TCGV_LOW(ret), TCGV_HIGH(arg1), c);
+                tcg_gen_movi_i32(TCGV_HIGH(ret), 0);
+            }
+        } else {
+            tcg_gen_shli_i32(TCGV_HIGH(ret), TCGV_LOW(arg1), c);
+            tcg_gen_movi_i32(TCGV_LOW(ret), 0);
+        }
+    } else {
+        TCGv_i32 t0, t1;
+
+        t0 = tcg_temp_new_i32();
+        t1 = tcg_temp_new_i32();
+        if (right) {
+            tcg_gen_shli_i32(t0, TCGV_HIGH(arg1), 32 - c);
+            if (arith) {
+                tcg_gen_sari_i32(t1, TCGV_HIGH(arg1), c);
+            } else {
+                tcg_gen_shri_i32(t1, TCGV_HIGH(arg1), c);
+            }
+            tcg_gen_shri_i32(TCGV_LOW(ret), TCGV_LOW(arg1), c);
+            tcg_gen_or_i32(TCGV_LOW(ret), TCGV_LOW(ret), t0);
+            tcg_gen_mov_i32(TCGV_HIGH(ret), t1);
+        } else {
+            tcg_gen_shri_i32(t0, TCGV_LOW(arg1), 32 - c);
+            /* Note: ret can be the same as arg1, so we use t1 */
+            tcg_gen_shli_i32(t1, TCGV_LOW(arg1), c);
+            tcg_gen_shli_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1), c);
+            tcg_gen_or_i32(TCGV_HIGH(ret), TCGV_HIGH(ret), t0);
+            tcg_gen_mov_i32(TCGV_LOW(ret), t1);
+        }
+        tcg_temp_free_i32(t0);
+        tcg_temp_free_i32(t1);
+    }
+}
+
+void tcg_gen_shli_i64(TCGv_i64 ret, TCGv_i64 arg1, unsigned arg2)
+{
+    tcg_gen_shifti_i64(ret, arg1, arg2, 0, 0);
+}
+
+void tcg_gen_shri_i64(TCGv_i64 ret, TCGv_i64 arg1, unsigned arg2)
+{
+    tcg_gen_shifti_i64(ret, arg1, arg2, 1, 0);
+}
+
+void tcg_gen_sari_i64(TCGv_i64 ret, TCGv_i64 arg1, unsigned arg2)
+{
+    tcg_gen_shifti_i64(ret, arg1, arg2, 1, 1);
+}
+#else /* TCG_TARGET_REG_BITS == 64 */
+void tcg_gen_shli_i64(TCGv_i64 ret, TCGv_i64 arg1, unsigned arg2)
+{
+    tcg_debug_assert(arg2 < 64);
+    if (arg2 == 0) {
+        tcg_gen_mov_i64(ret, arg1);
+    } else {
+        TCGv_i64 t0 = tcg_const_i64(arg2);
+        tcg_gen_shl_i64(ret, arg1, t0);
+        tcg_temp_free_i64(t0);
+    }
+}
+
+void tcg_gen_shri_i64(TCGv_i64 ret, TCGv_i64 arg1, unsigned arg2)
+{
+    tcg_debug_assert(arg2 < 64);
+    if (arg2 == 0) {
+        tcg_gen_mov_i64(ret, arg1);
+    } else {
+        TCGv_i64 t0 = tcg_const_i64(arg2);
+        tcg_gen_shr_i64(ret, arg1, t0);
+        tcg_temp_free_i64(t0);
+    }
+}
+
+void tcg_gen_sari_i64(TCGv_i64 ret, TCGv_i64 arg1, unsigned arg2)
+{
+    tcg_debug_assert(arg2 < 64);
+    if (arg2 == 0) {
+        tcg_gen_mov_i64(ret, arg1);
+    } else {
+        TCGv_i64 t0 = tcg_const_i64(arg2);
+        tcg_gen_sar_i64(ret, arg1, t0);
+        tcg_temp_free_i64(t0);
+    }
+}
+#endif /* TCG_TARGET_REG_BITS */
+
+void tcg_gen_brcond_i64(TCGCond cond, TCGv_i64 arg1, TCGv_i64 arg2, int label)
+{
+    if (cond == TCG_COND_ALWAYS) {
+        tcg_gen_br(label);
+    } else if (cond != TCG_COND_NEVER) {
+#if TCG_TARGET_REG_BITS == 32
+        tcg_gen_op6ii_i32(INDEX_op_brcond2_i32, TCGV_LOW(arg1),
+                          TCGV_HIGH(arg1), TCGV_LOW(arg2),
+                          TCGV_HIGH(arg2), cond, label);
+#else
+        tcg_gen_op4ii_i64(INDEX_op_brcond_i64, arg1, arg2, cond, label);
+#endif
+    }
+}
+
+void tcg_gen_brcondi_i64(TCGCond cond, TCGv_i64 arg1, int64_t arg2, int label)
+{
+    if (cond == TCG_COND_ALWAYS) {
+        tcg_gen_br(label);
+    } else if (cond != TCG_COND_NEVER) {
+        TCGv_i64 t0 = tcg_const_i64(arg2);
+        tcg_gen_brcond_i64(cond, arg1, t0, label);
+        tcg_temp_free_i64(t0);
+    }
+}
+
+void tcg_gen_setcond_i64(TCGCond cond, TCGv_i64 ret,
+                         TCGv_i64 arg1, TCGv_i64 arg2)
+{
+    if (cond == TCG_COND_ALWAYS) {
+        tcg_gen_movi_i64(ret, 1);
+    } else if (cond == TCG_COND_NEVER) {
+        tcg_gen_movi_i64(ret, 0);
+    } else {
+#if TCG_TARGET_REG_BITS == 32
+        tcg_gen_op6i_i32(INDEX_op_setcond2_i32, TCGV_LOW(ret),
+                         TCGV_LOW(arg1), TCGV_HIGH(arg1),
+                         TCGV_LOW(arg2), TCGV_HIGH(arg2), cond);
+        tcg_gen_movi_i32(TCGV_HIGH(ret), 0);
+#else
+        tcg_gen_op4i_i64(INDEX_op_setcond_i64, ret, arg1, arg2, cond);
+#endif
+    }
+}
+
+void tcg_gen_setcondi_i64(TCGCond cond, TCGv_i64 ret,
+                          TCGv_i64 arg1, int64_t arg2)
+{
+    TCGv_i64 t0 = tcg_const_i64(arg2);
+    tcg_gen_setcond_i64(cond, ret, arg1, t0);
+    tcg_temp_free_i64(t0);
+}
+
+void tcg_gen_muli_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
+{
+    TCGv_i64 t0 = tcg_const_i64(arg2);
+    tcg_gen_mul_i64(ret, arg1, t0);
+    tcg_temp_free_i64(t0);
+}
+
+void tcg_gen_div_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
+{
+    if (TCG_TARGET_HAS_div_i64) {
+        tcg_gen_op3_i64(INDEX_op_div_i64, ret, arg1, arg2);
+    } else if (TCG_TARGET_HAS_div2_i64) {
+        TCGv_i64 t0 = tcg_temp_new_i64();
+        tcg_gen_sari_i64(t0, arg1, 63);
+        tcg_gen_op5_i64(INDEX_op_div2_i64, ret, t0, arg1, t0, arg2);
+        tcg_temp_free_i64(t0);
+    } else {
+        gen_helper_div_i64(ret, arg1, arg2);
+    }
+}
+
+void tcg_gen_rem_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
+{
+    if (TCG_TARGET_HAS_rem_i64) {
+        tcg_gen_op3_i64(INDEX_op_rem_i64, ret, arg1, arg2);
+    } else if (TCG_TARGET_HAS_div_i64) {
+        TCGv_i64 t0 = tcg_temp_new_i64();
+        tcg_gen_op3_i64(INDEX_op_div_i64, t0, arg1, arg2);
+        tcg_gen_mul_i64(t0, t0, arg2);
+        tcg_gen_sub_i64(ret, arg1, t0);
+        tcg_temp_free_i64(t0);
+    } else if (TCG_TARGET_HAS_div2_i64) {
+        TCGv_i64 t0 = tcg_temp_new_i64();
+        tcg_gen_sari_i64(t0, arg1, 63);
+        tcg_gen_op5_i64(INDEX_op_div2_i64, t0, ret, arg1, t0, arg2);
+        tcg_temp_free_i64(t0);
+    } else {
+        gen_helper_rem_i64(ret, arg1, arg2);
+    }
+}
+
+void tcg_gen_divu_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
+{
+    if (TCG_TARGET_HAS_div_i64) {
+        tcg_gen_op3_i64(INDEX_op_divu_i64, ret, arg1, arg2);
+    } else if (TCG_TARGET_HAS_div2_i64) {
+        TCGv_i64 t0 = tcg_temp_new_i64();
+        tcg_gen_movi_i64(t0, 0);
+        tcg_gen_op5_i64(INDEX_op_divu2_i64, ret, t0, arg1, t0, arg2);
+        tcg_temp_free_i64(t0);
+    } else {
+        gen_helper_divu_i64(ret, arg1, arg2);
+    }
+}
+
+void tcg_gen_remu_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
+{
+    if (TCG_TARGET_HAS_rem_i64) {
+        tcg_gen_op3_i64(INDEX_op_remu_i64, ret, arg1, arg2);
+    } else if (TCG_TARGET_HAS_div_i64) {
+        TCGv_i64 t0 = tcg_temp_new_i64();
+        tcg_gen_op3_i64(INDEX_op_divu_i64, t0, arg1, arg2);
+        tcg_gen_mul_i64(t0, t0, arg2);
+        tcg_gen_sub_i64(ret, arg1, t0);
+        tcg_temp_free_i64(t0);
+    } else if (TCG_TARGET_HAS_div2_i64) {
+        TCGv_i64 t0 = tcg_temp_new_i64();
+        tcg_gen_movi_i64(t0, 0);
+        tcg_gen_op5_i64(INDEX_op_divu2_i64, t0, ret, arg1, t0, arg2);
+        tcg_temp_free_i64(t0);
+    } else {
+        gen_helper_remu_i64(ret, arg1, arg2);
+    }
+}
+
+void tcg_gen_ext8s_i64(TCGv_i64 ret, TCGv_i64 arg)
+{
+#if TCG_TARGET_REG_BITS == 32
+    tcg_gen_ext8s_i32(TCGV_LOW(ret), TCGV_LOW(arg));
+    tcg_gen_sari_i32(TCGV_HIGH(ret), TCGV_LOW(ret), 31);
+#else
+    if (TCG_TARGET_HAS_ext8s_i64) {
+        tcg_gen_op2_i64(INDEX_op_ext8s_i64, ret, arg);
+    } else {
+        tcg_gen_shli_i64(ret, arg, 56);
+        tcg_gen_sari_i64(ret, ret, 56);
+    }
+#endif
+}
+
+void tcg_gen_ext16s_i64(TCGv_i64 ret, TCGv_i64 arg)
+{
+#if TCG_TARGET_REG_BITS == 32
+    tcg_gen_ext16s_i32(TCGV_LOW(ret), TCGV_LOW(arg));
+    tcg_gen_sari_i32(TCGV_HIGH(ret), TCGV_LOW(ret), 31);
+#else
+    if (TCG_TARGET_HAS_ext16s_i64) {
+        tcg_gen_op2_i64(INDEX_op_ext16s_i64, ret, arg);
+    } else {
+        tcg_gen_shli_i64(ret, arg, 48);
+        tcg_gen_sari_i64(ret, ret, 48);
+    }
+#endif
+}
+
+void tcg_gen_ext32s_i64(TCGv_i64 ret, TCGv_i64 arg)
+{
+#if TCG_TARGET_REG_BITS == 32
+    tcg_gen_mov_i32(TCGV_LOW(ret), TCGV_LOW(arg));
+    tcg_gen_sari_i32(TCGV_HIGH(ret), TCGV_LOW(ret), 31);
+#else
+    if (TCG_TARGET_HAS_ext32s_i64) {
+        tcg_gen_op2_i64(INDEX_op_ext32s_i64, ret, arg);
+    } else {
+        tcg_gen_shli_i64(ret, arg, 32);
+        tcg_gen_sari_i64(ret, ret, 32);
+    }
+#endif
+}
+
+void tcg_gen_ext8u_i64(TCGv_i64 ret, TCGv_i64 arg)
+{
+#if TCG_TARGET_REG_BITS == 32
+    tcg_gen_ext8u_i32(TCGV_LOW(ret), TCGV_LOW(arg));
+    tcg_gen_movi_i32(TCGV_HIGH(ret), 0);
+#else
+    if (TCG_TARGET_HAS_ext8u_i64) {
+        tcg_gen_op2_i64(INDEX_op_ext8u_i64, ret, arg);
+    } else {
+        tcg_gen_andi_i64(ret, arg, 0xffu);
+    }
+#endif
+}
+
+void tcg_gen_ext16u_i64(TCGv_i64 ret, TCGv_i64 arg)
+{
+#if TCG_TARGET_REG_BITS == 32
+    tcg_gen_ext16u_i32(TCGV_LOW(ret), TCGV_LOW(arg));
+    tcg_gen_movi_i32(TCGV_HIGH(ret), 0);
+#else
+    if (TCG_TARGET_HAS_ext16u_i64) {
+        tcg_gen_op2_i64(INDEX_op_ext16u_i64, ret, arg);
+    } else {
+        tcg_gen_andi_i64(ret, arg, 0xffffu);
+    }
+#endif
+}
+
+void tcg_gen_ext32u_i64(TCGv_i64 ret, TCGv_i64 arg)
+{
+#if TCG_TARGET_REG_BITS == 32
+    tcg_gen_mov_i32(TCGV_LOW(ret), TCGV_LOW(arg));
+    tcg_gen_movi_i32(TCGV_HIGH(ret), 0);
+#else
+    if (TCG_TARGET_HAS_ext32u_i64) {
+        tcg_gen_op2_i64(INDEX_op_ext32u_i64, ret, arg);
+    } else {
+        tcg_gen_andi_i64(ret, arg, 0xffffffffu);
+    }
+#endif
+}
+
+/* Note: we assume the six high bytes are set to zero */
+void tcg_gen_bswap16_i64(TCGv_i64 ret, TCGv_i64 arg)
+{
+#if TCG_TARGET_REG_BITS == 32
+    tcg_gen_bswap16_i32(TCGV_LOW(ret), TCGV_LOW(arg));
+    tcg_gen_movi_i32(TCGV_HIGH(ret), 0);
+#else
+    if (TCG_TARGET_HAS_bswap16_i64) {
+        tcg_gen_op2_i64(INDEX_op_bswap16_i64, ret, arg);
+    } else {
+        TCGv_i64 t0 = tcg_temp_new_i64();
+
+        tcg_gen_ext8u_i64(t0, arg);
+        tcg_gen_shli_i64(t0, t0, 8);
+        tcg_gen_shri_i64(ret, arg, 8);
+        tcg_gen_or_i64(ret, ret, t0);
+        tcg_temp_free_i64(t0);
+    }
+#endif
+}
+
+/* Note: we assume the four high bytes are set to zero */
+void tcg_gen_bswap32_i64(TCGv_i64 ret, TCGv_i64 arg)
+{
+#if TCG_TARGET_REG_BITS == 32
+    tcg_gen_bswap32_i32(TCGV_LOW(ret), TCGV_LOW(arg));
+    tcg_gen_movi_i32(TCGV_HIGH(ret), 0);
+#else
+    if (TCG_TARGET_HAS_bswap32_i64) {
+        tcg_gen_op2_i64(INDEX_op_bswap32_i64, ret, arg);
+    } else {
+        TCGv_i64 t0, t1;
+        t0 = tcg_temp_new_i64();
+        t1 = tcg_temp_new_i64();
+
+        tcg_gen_shli_i64(t0, arg, 24);
+        tcg_gen_ext32u_i64(t0, t0);
+
+        tcg_gen_andi_i64(t1, arg, 0x0000ff00);
+        tcg_gen_shli_i64(t1, t1, 8);
+        tcg_gen_or_i64(t0, t0, t1);
+
+        tcg_gen_shri_i64(t1, arg, 8);
+        tcg_gen_andi_i64(t1, t1, 0x0000ff00);
+        tcg_gen_or_i64(t0, t0, t1);
+
+        tcg_gen_shri_i64(t1, arg, 24);
+        tcg_gen_or_i64(ret, t0, t1);
+        tcg_temp_free_i64(t0);
+        tcg_temp_free_i64(t1);
+    }
+#endif
+}
+
+void tcg_gen_bswap64_i64(TCGv_i64 ret, TCGv_i64 arg)
+{
+#if TCG_TARGET_REG_BITS == 32
+    TCGv_i32 t0, t1;
+    t0 = tcg_temp_new_i32();
+    t1 = tcg_temp_new_i32();
+
+    tcg_gen_bswap32_i32(t0, TCGV_LOW(arg));
+    tcg_gen_bswap32_i32(t1, TCGV_HIGH(arg));
+    tcg_gen_mov_i32(TCGV_LOW(ret), t1);
+    tcg_gen_mov_i32(TCGV_HIGH(ret), t0);
+    tcg_temp_free_i32(t0);
+    tcg_temp_free_i32(t1);
+#else
+    if (TCG_TARGET_HAS_bswap64_i64) {
+        tcg_gen_op2_i64(INDEX_op_bswap64_i64, ret, arg);
+    } else {
+        TCGv_i64 t0 = tcg_temp_new_i64();
+        TCGv_i64 t1 = tcg_temp_new_i64();
+
+        tcg_gen_shli_i64(t0, arg, 56);
+
+        tcg_gen_andi_i64(t1, arg, 0x0000ff00);
+        tcg_gen_shli_i64(t1, t1, 40);
+        tcg_gen_or_i64(t0, t0, t1);
+
+        tcg_gen_andi_i64(t1, arg, 0x00ff0000);
+        tcg_gen_shli_i64(t1, t1, 24);
+        tcg_gen_or_i64(t0, t0, t1);
+
+        tcg_gen_andi_i64(t1, arg, 0xff000000);
+        tcg_gen_shli_i64(t1, t1, 8);
+        tcg_gen_or_i64(t0, t0, t1);
+
+        tcg_gen_shri_i64(t1, arg, 8);
+        tcg_gen_andi_i64(t1, t1, 0xff000000);
+        tcg_gen_or_i64(t0, t0, t1);
+
+        tcg_gen_shri_i64(t1, arg, 24);
+        tcg_gen_andi_i64(t1, t1, 0x00ff0000);
+        tcg_gen_or_i64(t0, t0, t1);
+
+        tcg_gen_shri_i64(t1, arg, 40);
+        tcg_gen_andi_i64(t1, t1, 0x0000ff00);
+        tcg_gen_or_i64(t0, t0, t1);
+
+        tcg_gen_shri_i64(t1, arg, 56);
+        tcg_gen_or_i64(ret, t0, t1);
+        tcg_temp_free_i64(t0);
+        tcg_temp_free_i64(t1);
+    }
+#endif
+}
+
+void tcg_gen_not_i64(TCGv_i64 ret, TCGv_i64 arg)
+{
+#if TCG_TARGET_REG_BITS == 64
+    if (TCG_TARGET_HAS_not_i64) {
+        tcg_gen_op2_i64(INDEX_op_not_i64, ret, arg);
+    } else {
+        tcg_gen_xori_i64(ret, arg, -1);
+    }
+#else
+    tcg_gen_not_i32(TCGV_LOW(ret), TCGV_LOW(arg));
+    tcg_gen_not_i32(TCGV_HIGH(ret), TCGV_HIGH(arg));
+#endif
+}
+
+void tcg_gen_andc_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
+{
+#if TCG_TARGET_REG_BITS == 64
+    if (TCG_TARGET_HAS_andc_i64) {
+        tcg_gen_op3_i64(INDEX_op_andc_i64, ret, arg1, arg2);
+    } else {
+        TCGv_i64 t0 = tcg_temp_new_i64();
+        tcg_gen_not_i64(t0, arg2);
+        tcg_gen_and_i64(ret, arg1, t0);
+        tcg_temp_free_i64(t0);
+    }
+#else
+    tcg_gen_andc_i32(TCGV_LOW(ret), TCGV_LOW(arg1), TCGV_LOW(arg2));
+    tcg_gen_andc_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1), TCGV_HIGH(arg2));
+#endif
+}
+
+void tcg_gen_eqv_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
+{
+#if TCG_TARGET_REG_BITS == 64
+    if (TCG_TARGET_HAS_eqv_i64) {
+        tcg_gen_op3_i64(INDEX_op_eqv_i64, ret, arg1, arg2);
+    } else {
+        tcg_gen_xor_i64(ret, arg1, arg2);
+        tcg_gen_not_i64(ret, ret);
+    }
+#else
+    tcg_gen_eqv_i32(TCGV_LOW(ret), TCGV_LOW(arg1), TCGV_LOW(arg2));
+    tcg_gen_eqv_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1), TCGV_HIGH(arg2));
+#endif
+}
+
+void tcg_gen_nand_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
+{
+#if TCG_TARGET_REG_BITS == 64
+    if (TCG_TARGET_HAS_nand_i64) {
+        tcg_gen_op3_i64(INDEX_op_nand_i64, ret, arg1, arg2);
+    } else {
+        tcg_gen_and_i64(ret, arg1, arg2);
+        tcg_gen_not_i64(ret, ret);
+    }
+#else
+    tcg_gen_nand_i32(TCGV_LOW(ret), TCGV_LOW(arg1), TCGV_LOW(arg2));
+    tcg_gen_nand_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1), TCGV_HIGH(arg2));
+#endif
+}
+
+void tcg_gen_nor_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
+{
+#if TCG_TARGET_REG_BITS == 64
+    if (TCG_TARGET_HAS_nor_i64) {
+        tcg_gen_op3_i64(INDEX_op_nor_i64, ret, arg1, arg2);
+    } else {
+        tcg_gen_or_i64(ret, arg1, arg2);
+        tcg_gen_not_i64(ret, ret);
+    }
+#else
+    tcg_gen_nor_i32(TCGV_LOW(ret), TCGV_LOW(arg1), TCGV_LOW(arg2));
+    tcg_gen_nor_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1), TCGV_HIGH(arg2));
+#endif
+}
+
+void tcg_gen_orc_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
+{
+#if TCG_TARGET_REG_BITS == 64
+    if (TCG_TARGET_HAS_orc_i64) {
+        tcg_gen_op3_i64(INDEX_op_orc_i64, ret, arg1, arg2);
+    } else {
+        TCGv_i64 t0 = tcg_temp_new_i64();
+        tcg_gen_not_i64(t0, arg2);
+        tcg_gen_or_i64(ret, arg1, t0);
+        tcg_temp_free_i64(t0);
+    }
+#else
+    tcg_gen_orc_i32(TCGV_LOW(ret), TCGV_LOW(arg1), TCGV_LOW(arg2));
+    tcg_gen_orc_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1), TCGV_HIGH(arg2));
+#endif
+}
+
+void tcg_gen_rotl_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
+{
+    if (TCG_TARGET_HAS_rot_i64) {
+        tcg_gen_op3_i64(INDEX_op_rotl_i64, ret, arg1, arg2);
+    } else {
+        TCGv_i64 t0, t1;
+        t0 = tcg_temp_new_i64();
+        t1 = tcg_temp_new_i64();
+        tcg_gen_shl_i64(t0, arg1, arg2);
+        tcg_gen_subfi_i64(t1, 64, arg2);
+        tcg_gen_shr_i64(t1, arg1, t1);
+        tcg_gen_or_i64(ret, t0, t1);
+        tcg_temp_free_i64(t0);
+        tcg_temp_free_i64(t1);
+    }
+}
+
+void tcg_gen_rotli_i64(TCGv_i64 ret, TCGv_i64 arg1, unsigned arg2)
+{
+    tcg_debug_assert(arg2 < 64);
+    /* some cases can be optimized here */
+    if (arg2 == 0) {
+        tcg_gen_mov_i64(ret, arg1);
+    } else if (TCG_TARGET_HAS_rot_i64) {
+        TCGv_i64 t0 = tcg_const_i64(arg2);
+        tcg_gen_rotl_i64(ret, arg1, t0);
+        tcg_temp_free_i64(t0);
+    } else {
+        TCGv_i64 t0, t1;
+        t0 = tcg_temp_new_i64();
+        t1 = tcg_temp_new_i64();
+        tcg_gen_shli_i64(t0, arg1, arg2);
+        tcg_gen_shri_i64(t1, arg1, 64 - arg2);
+        tcg_gen_or_i64(ret, t0, t1);
+        tcg_temp_free_i64(t0);
+        tcg_temp_free_i64(t1);
+    }
+}
+
+void tcg_gen_rotr_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
+{
+    if (TCG_TARGET_HAS_rot_i64) {
+        tcg_gen_op3_i64(INDEX_op_rotr_i64, ret, arg1, arg2);
+    } else {
+        TCGv_i64 t0, t1;
+        t0 = tcg_temp_new_i64();
+        t1 = tcg_temp_new_i64();
+        tcg_gen_shr_i64(t0, arg1, arg2);
+        tcg_gen_subfi_i64(t1, 64, arg2);
+        tcg_gen_shl_i64(t1, arg1, t1);
+        tcg_gen_or_i64(ret, t0, t1);
+        tcg_temp_free_i64(t0);
+        tcg_temp_free_i64(t1);
+    }
+}
+
+void tcg_gen_rotri_i64(TCGv_i64 ret, TCGv_i64 arg1, unsigned arg2)
+{
+    tcg_debug_assert(arg2 < 64);
+    /* some cases can be optimized here */
+    if (arg2 == 0) {
+        tcg_gen_mov_i64(ret, arg1);
+    } else {
+        tcg_gen_rotli_i64(ret, arg1, 64 - arg2);
+    }
+}
+
+void tcg_gen_deposit_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2,
+                         unsigned int ofs, unsigned int len)
+{
+    uint64_t mask;
+    TCGv_i64 t1;
+
+    tcg_debug_assert(ofs < 64);
+    tcg_debug_assert(len <= 64);
+    tcg_debug_assert(ofs + len <= 64);
+
+    if (ofs == 0 && len == 64) {
+        tcg_gen_mov_i64(ret, arg2);
+        return;
+    }
+    if (TCG_TARGET_HAS_deposit_i64 && TCG_TARGET_deposit_i64_valid(ofs, len)) {
+        tcg_gen_op5ii_i64(INDEX_op_deposit_i64, ret, arg1, arg2, ofs, len);
+        return;
+    }
+
+#if TCG_TARGET_REG_BITS == 32
+    if (ofs >= 32) {
+        tcg_gen_deposit_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1),
+                            TCGV_LOW(arg2), ofs - 32, len);
+        tcg_gen_mov_i32(TCGV_LOW(ret), TCGV_LOW(arg1));
+        return;
+    }
+    if (ofs + len <= 32) {
+        tcg_gen_deposit_i32(TCGV_LOW(ret), TCGV_LOW(arg1),
+                            TCGV_LOW(arg2), ofs, len);
+        tcg_gen_mov_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1));
+        return;
+    }
+#endif
+
+    mask = (1ull << len) - 1;
+    t1 = tcg_temp_new_i64();
+
+    if (ofs + len < 64) {
+        tcg_gen_andi_i64(t1, arg2, mask);
+        tcg_gen_shli_i64(t1, t1, ofs);
+    } else {
+        tcg_gen_shli_i64(t1, arg2, ofs);
+    }
+    tcg_gen_andi_i64(ret, arg1, ~(mask << ofs));
+    tcg_gen_or_i64(ret, ret, t1);
+
+    tcg_temp_free_i64(t1);
+}
+
+void tcg_gen_movcond_i64(TCGCond cond, TCGv_i64 ret, TCGv_i64 c1,
+                         TCGv_i64 c2, TCGv_i64 v1, TCGv_i64 v2)
+{
+#if TCG_TARGET_REG_BITS == 32
+    TCGv_i32 t0 = tcg_temp_new_i32();
+    TCGv_i32 t1 = tcg_temp_new_i32();
+    tcg_gen_op6i_i32(INDEX_op_setcond2_i32, t0,
+                     TCGV_LOW(c1), TCGV_HIGH(c1),
+                     TCGV_LOW(c2), TCGV_HIGH(c2), cond);
+
+    if (TCG_TARGET_HAS_movcond_i32) {
+        tcg_gen_movi_i32(t1, 0);
+        tcg_gen_movcond_i32(TCG_COND_NE, TCGV_LOW(ret), t0, t1,
+                            TCGV_LOW(v1), TCGV_LOW(v2));
+        tcg_gen_movcond_i32(TCG_COND_NE, TCGV_HIGH(ret), t0, t1,
+                            TCGV_HIGH(v1), TCGV_HIGH(v2));
+    } else {
+        tcg_gen_neg_i32(t0, t0);
+
+        tcg_gen_and_i32(t1, TCGV_LOW(v1), t0);
+        tcg_gen_andc_i32(TCGV_LOW(ret), TCGV_LOW(v2), t0);
+        tcg_gen_or_i32(TCGV_LOW(ret), TCGV_LOW(ret), t1);
+
+        tcg_gen_and_i32(t1, TCGV_HIGH(v1), t0);
+        tcg_gen_andc_i32(TCGV_HIGH(ret), TCGV_HIGH(v2), t0);
+        tcg_gen_or_i32(TCGV_HIGH(ret), TCGV_HIGH(ret), t1);
+    }
+    tcg_temp_free_i32(t0);
+    tcg_temp_free_i32(t1);
+#else
+    if (TCG_TARGET_HAS_movcond_i64) {
+        tcg_gen_op6i_i64(INDEX_op_movcond_i64, ret, c1, c2, v1, v2, cond);
+    } else {
+        TCGv_i64 t0 = tcg_temp_new_i64();
+        TCGv_i64 t1 = tcg_temp_new_i64();
+        tcg_gen_setcond_i64(cond, t0, c1, c2);
+        tcg_gen_neg_i64(t0, t0);
+        tcg_gen_and_i64(t1, v1, t0);
+        tcg_gen_andc_i64(ret, v2, t0);
+        tcg_gen_or_i64(ret, ret, t1);
+        tcg_temp_free_i64(t0);
+        tcg_temp_free_i64(t1);
+    }
+#endif
+}
+
+void tcg_gen_add2_i64(TCGv_i64 rl, TCGv_i64 rh, TCGv_i64 al,
+                      TCGv_i64 ah, TCGv_i64 bl, TCGv_i64 bh)
+{
+    if (TCG_TARGET_HAS_add2_i64) {
+        tcg_gen_op6_i64(INDEX_op_add2_i64, rl, rh, al, ah, bl, bh);
+        /* Allow the optimizer room to replace add2 with two moves.  */
+        tcg_gen_op0(&tcg_ctx, INDEX_op_nop);
+    } else {
+        TCGv_i64 t0 = tcg_temp_new_i64();
+        TCGv_i64 t1 = tcg_temp_new_i64();
+        tcg_gen_add_i64(t0, al, bl);
+        tcg_gen_setcond_i64(TCG_COND_LTU, t1, t0, al);
+        tcg_gen_add_i64(rh, ah, bh);
+        tcg_gen_add_i64(rh, rh, t1);
+        tcg_gen_mov_i64(rl, t0);
+        tcg_temp_free_i64(t0);
+        tcg_temp_free_i64(t1);
+    }
+}
+
+void tcg_gen_sub2_i64(TCGv_i64 rl, TCGv_i64 rh, TCGv_i64 al,
+                      TCGv_i64 ah, TCGv_i64 bl, TCGv_i64 bh)
+{
+    if (TCG_TARGET_HAS_sub2_i64) {
+        tcg_gen_op6_i64(INDEX_op_sub2_i64, rl, rh, al, ah, bl, bh);
+        /* Allow the optimizer room to replace sub2 with two moves.  */
+        tcg_gen_op0(&tcg_ctx, INDEX_op_nop);
+    } else {
+        TCGv_i64 t0 = tcg_temp_new_i64();
+        TCGv_i64 t1 = tcg_temp_new_i64();
+        tcg_gen_sub_i64(t0, al, bl);
+        tcg_gen_setcond_i64(TCG_COND_LTU, t1, al, bl);
+        tcg_gen_sub_i64(rh, ah, bh);
+        tcg_gen_sub_i64(rh, rh, t1);
+        tcg_gen_mov_i64(rl, t0);
+        tcg_temp_free_i64(t0);
+        tcg_temp_free_i64(t1);
+    }
+}
+
+void tcg_gen_mulu2_i64(TCGv_i64 rl, TCGv_i64 rh, TCGv_i64 arg1, TCGv_i64 arg2)
+{
+    if (TCG_TARGET_HAS_mulu2_i64) {
+        tcg_gen_op4_i64(INDEX_op_mulu2_i64, rl, rh, arg1, arg2);
+        /* Allow the optimizer room to replace mulu2 with two moves.  */
+        tcg_gen_op0(&tcg_ctx, INDEX_op_nop);
+    } else if (TCG_TARGET_HAS_muluh_i64) {
+        TCGv_i64 t = tcg_temp_new_i64();
+        tcg_gen_op3_i64(INDEX_op_mul_i64, t, arg1, arg2);
+        tcg_gen_op3_i64(INDEX_op_muluh_i64, rh, arg1, arg2);
+        tcg_gen_mov_i64(rl, t);
+        tcg_temp_free_i64(t);
+    } else {
+        TCGv_i64 t0 = tcg_temp_new_i64();
+        tcg_gen_mul_i64(t0, arg1, arg2);
+        gen_helper_muluh_i64(rh, arg1, arg2);
+        tcg_gen_mov_i64(rl, t0);
+        tcg_temp_free_i64(t0);
+    }
+}
+
+void tcg_gen_muls2_i64(TCGv_i64 rl, TCGv_i64 rh, TCGv_i64 arg1, TCGv_i64 arg2)
+{
+    if (TCG_TARGET_HAS_muls2_i64) {
+        tcg_gen_op4_i64(INDEX_op_muls2_i64, rl, rh, arg1, arg2);
+        /* Allow the optimizer room to replace muls2 with two moves.  */
+        tcg_gen_op0(&tcg_ctx, INDEX_op_nop);
+    } else if (TCG_TARGET_HAS_mulsh_i64) {
+        TCGv_i64 t = tcg_temp_new_i64();
+        tcg_gen_op3_i64(INDEX_op_mul_i64, t, arg1, arg2);
+        tcg_gen_op3_i64(INDEX_op_mulsh_i64, rh, arg1, arg2);
+        tcg_gen_mov_i64(rl, t);
+        tcg_temp_free_i64(t);
+    } else if (TCG_TARGET_HAS_mulu2_i64 || TCG_TARGET_HAS_muluh_i64) {
+        TCGv_i64 t0 = tcg_temp_new_i64();
+        TCGv_i64 t1 = tcg_temp_new_i64();
+        TCGv_i64 t2 = tcg_temp_new_i64();
+        TCGv_i64 t3 = tcg_temp_new_i64();
+        tcg_gen_mulu2_i64(t0, t1, arg1, arg2);
+        /* Adjust for negative inputs.  */
+        tcg_gen_sari_i64(t2, arg1, 63);
+        tcg_gen_sari_i64(t3, arg2, 63);
+        tcg_gen_and_i64(t2, t2, arg2);
+        tcg_gen_and_i64(t3, t3, arg1);
+        tcg_gen_sub_i64(rh, t1, t2);
+        tcg_gen_sub_i64(rh, rh, t3);
+        tcg_gen_mov_i64(rl, t0);
+        tcg_temp_free_i64(t0);
+        tcg_temp_free_i64(t1);
+        tcg_temp_free_i64(t2);
+        tcg_temp_free_i64(t3);
+    } else {
+        TCGv_i64 t0 = tcg_temp_new_i64();
+        tcg_gen_mul_i64(t0, arg1, arg2);
+        gen_helper_mulsh_i64(rh, arg1, arg2);
+        tcg_gen_mov_i64(rl, t0);
+        tcg_temp_free_i64(t0);
+    }
+}
+
+/* Size changing operations.  */
+
+void tcg_gen_trunc_shr_i64_i32(TCGv_i32 ret, TCGv_i64 arg, unsigned count)
+{
+    tcg_debug_assert(count < 64);
+#if TCG_TARGET_REG_BITS == 32
+    if (count >= 32) {
+        tcg_gen_shri_i32(ret, TCGV_HIGH(arg), count - 32);
+    } else if (count == 0) {
+        tcg_gen_mov_i32(ret, TCGV_LOW(arg));
+    } else {
+        TCGv_i64 t = tcg_temp_new_i64();
+        tcg_gen_shri_i64(t, arg, count);
+        tcg_gen_mov_i32(ret, TCGV_LOW(t));
+        tcg_temp_free_i64(t);
+    }
+#else
+    if (TCG_TARGET_HAS_trunc_shr_i32) {
+        tcg_gen_op3i_i32(INDEX_op_trunc_shr_i32, ret,
+                         MAKE_TCGV_I32(GET_TCGV_I64(arg)), count);
+    } else if (count == 0) {
+        tcg_gen_mov_i32(ret, MAKE_TCGV_I32(GET_TCGV_I64(arg)));
+    } else {
+        TCGv_i64 t = tcg_temp_new_i64();
+        tcg_gen_shri_i64(t, arg, count);
+        tcg_gen_mov_i32(ret, MAKE_TCGV_I32(GET_TCGV_I64(t)));
+        tcg_temp_free_i64(t);
+    }
+#endif
+}
+
+void tcg_gen_extu_i32_i64(TCGv_i64 ret, TCGv_i32 arg)
+{
+#if TCG_TARGET_REG_BITS == 32
+    tcg_gen_mov_i32(TCGV_LOW(ret), arg);
+    tcg_gen_movi_i32(TCGV_HIGH(ret), 0);
+#else
+    /* Note: we assume the target supports move between
+       32 and 64 bit registers.  */
+    tcg_gen_ext32u_i64(ret, MAKE_TCGV_I64(GET_TCGV_I32(arg)));
+#endif
+}
+
+void tcg_gen_ext_i32_i64(TCGv_i64 ret, TCGv_i32 arg)
+{
+#if TCG_TARGET_REG_BITS == 32
+    tcg_gen_mov_i32(TCGV_LOW(ret), arg);
+    tcg_gen_sari_i32(TCGV_HIGH(ret), TCGV_LOW(ret), 31);
+#else
+    /* Note: we assume the target supports move between
+       32 and 64 bit registers.  */
+    tcg_gen_ext32s_i64(ret, MAKE_TCGV_I64(GET_TCGV_I32(arg)));
+#endif
+}
+
+void tcg_gen_concat_i32_i64(TCGv_i64 dest, TCGv_i32 low, TCGv_i32 high)
+{
+#if TCG_TARGET_REG_BITS == 32
+    tcg_gen_mov_i32(TCGV_LOW(dest), low);
+    tcg_gen_mov_i32(TCGV_HIGH(dest), high);
+#else
+    TCGv_i64 tmp = tcg_temp_new_i64();
+    /* These extensions are only needed for type correctness.
+       We may be able to do better given target specific information.  */
+    tcg_gen_extu_i32_i64(tmp, high);
+    tcg_gen_extu_i32_i64(dest, low);
+    /* If deposit is available, use it.  Otherwise use the extra
+       knowledge that we have of the zero-extensions above.  */
+    if (TCG_TARGET_HAS_deposit_i64 && TCG_TARGET_deposit_i64_valid(32, 32)) {
+        tcg_gen_deposit_i64(dest, dest, tmp, 32, 32);
+    } else {
+        tcg_gen_shli_i64(tmp, tmp, 32);
+        tcg_gen_or_i64(dest, dest, tmp);
+    }
+    tcg_temp_free_i64(tmp);
+#endif
+}
+
+void tcg_gen_extr_i64_i32(TCGv_i32 lo, TCGv_i32 hi, TCGv_i64 arg)
+{
+#if TCG_TARGET_REG_BITS == 32
+    tcg_gen_mov_i32(lo, TCGV_LOW(arg));
+    tcg_gen_mov_i32(hi, TCGV_HIGH(arg));
+#else
+    tcg_gen_trunc_shr_i64_i32(lo, arg, 0);
+    tcg_gen_trunc_shr_i64_i32(hi, arg, 32);
+#endif
+}
+
+void tcg_gen_extr32_i64(TCGv_i64 lo, TCGv_i64 hi, TCGv_i64 arg)
+{
+    tcg_gen_ext32u_i64(lo, arg);
+    tcg_gen_shri_i64(hi, arg, 32);
+}
+
+/* QEMU specific operations.  */
+
+void tcg_gen_goto_tb(unsigned idx)
+{
+    /* We only support two chained exits.  */
+    tcg_debug_assert(idx <= 1);
+#ifdef CONFIG_DEBUG_TCG
+    /* Verify that we haven't seen this numbered exit before.  */
+    tcg_debug_assert((tcg_ctx.goto_tb_issue_mask & (1 << idx)) == 0);
+    tcg_ctx.goto_tb_issue_mask |= 1 << idx;
+#endif
+    tcg_gen_op1i(INDEX_op_goto_tb, idx);
+}
+
+static inline TCGMemOp tcg_canonicalize_memop(TCGMemOp op, bool is64, bool st)
+{
+    switch (op & MO_SIZE) {
+    case MO_8:
+        op &= ~MO_BSWAP;
+        break;
+    case MO_16:
+        break;
+    case MO_32:
+        if (!is64) {
+            op &= ~MO_SIGN;
+        }
+        break;
+    case MO_64:
+        if (!is64) {
+            tcg_abort();
+        }
+        break;
+    }
+    if (st) {
+        op &= ~MO_SIGN;
+    }
+    return op;
+}
+
+static inline void tcg_add_param_i32(TCGv_i32 val)
+{
+    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(val);
+}
+
+static inline void tcg_add_param_i64(TCGv_i64 val)
+{
+#if TCG_TARGET_REG_BITS == 32
+    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(TCGV_LOW(val));
+    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(TCGV_HIGH(val));
+#else
+    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(val);
+#endif
+}
+
+#if TARGET_LONG_BITS == 32
+# define tcg_add_param_tl  tcg_add_param_i32
+#else
+# define tcg_add_param_tl  tcg_add_param_i64
+#endif
+
+void tcg_gen_qemu_ld_i32(TCGv_i32 val, TCGv addr, TCGArg idx, TCGMemOp memop)
+{
+    memop = tcg_canonicalize_memop(memop, 0, 0);
+
+    *tcg_ctx.gen_opc_ptr++ = INDEX_op_qemu_ld_i32;
+    tcg_add_param_i32(val);
+    tcg_add_param_tl(addr);
+    *tcg_ctx.gen_opparam_ptr++ = memop;
+    *tcg_ctx.gen_opparam_ptr++ = idx;
+}
+
+void tcg_gen_qemu_st_i32(TCGv_i32 val, TCGv addr, TCGArg idx, TCGMemOp memop)
+{
+    memop = tcg_canonicalize_memop(memop, 0, 1);
+
+    *tcg_ctx.gen_opc_ptr++ = INDEX_op_qemu_st_i32;
+    tcg_add_param_i32(val);
+    tcg_add_param_tl(addr);
+    *tcg_ctx.gen_opparam_ptr++ = memop;
+    *tcg_ctx.gen_opparam_ptr++ = idx;
+}
+
+void tcg_gen_qemu_ld_i64(TCGv_i64 val, TCGv addr, TCGArg idx, TCGMemOp memop)
+{
+    memop = tcg_canonicalize_memop(memop, 1, 0);
+
+#if TCG_TARGET_REG_BITS == 32
+    if ((memop & MO_SIZE) < MO_64) {
+        tcg_gen_qemu_ld_i32(TCGV_LOW(val), addr, idx, memop);
+        if (memop & MO_SIGN) {
+            tcg_gen_sari_i32(TCGV_HIGH(val), TCGV_LOW(val), 31);
+        } else {
+            tcg_gen_movi_i32(TCGV_HIGH(val), 0);
+        }
+        return;
+    }
+#endif
+
+    *tcg_ctx.gen_opc_ptr++ = INDEX_op_qemu_ld_i64;
+    tcg_add_param_i64(val);
+    tcg_add_param_tl(addr);
+    *tcg_ctx.gen_opparam_ptr++ = memop;
+    *tcg_ctx.gen_opparam_ptr++ = idx;
+}
+
+void tcg_gen_qemu_st_i64(TCGv_i64 val, TCGv addr, TCGArg idx, TCGMemOp memop)
+{
+    memop = tcg_canonicalize_memop(memop, 1, 1);
+
+#if TCG_TARGET_REG_BITS == 32
+    if ((memop & MO_SIZE) < MO_64) {
+        tcg_gen_qemu_st_i32(TCGV_LOW(val), addr, idx, memop);
+        return;
+    }
+#endif
+
+    *tcg_ctx.gen_opc_ptr++ = INDEX_op_qemu_st_i64;
+    tcg_add_param_i64(val);
+    tcg_add_param_tl(addr);
+    *tcg_ctx.gen_opparam_ptr++ = memop;
+    *tcg_ctx.gen_opparam_ptr++ = idx;
+}
diff --git a/tcg/tcg-op.h b/tcg/tcg-op.h
index 019dd9b..eacfd8a 100644
--- a/tcg/tcg-op.h
+++ b/tcg/tcg-op.h
@@ -21,359 +21,310 @@
  * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
  * THE SOFTWARE.
  */
+
 #include "tcg.h"
 #include "exec/helper-proto.h"
 #include "exec/helper-gen.h"
 
-int gen_new_label(void);
+/* Basic output routines.  Not for general consumption.  */
 
-static inline void tcg_gen_op0(TCGOpcode opc)
-{
-    *tcg_ctx.gen_opc_ptr++ = opc;
-}
+void tcg_gen_op0(TCGContext *, TCGOpcode);
+void tcg_gen_op1(TCGContext *, TCGOpcode, TCGArg);
+void tcg_gen_op2(TCGContext *, TCGOpcode, TCGArg, TCGArg);
+void tcg_gen_op3(TCGContext *, TCGOpcode, TCGArg, TCGArg, TCGArg);
+void tcg_gen_op4(TCGContext *, TCGOpcode, TCGArg, TCGArg, TCGArg, TCGArg);
+void tcg_gen_op5(TCGContext *, TCGOpcode, TCGArg, TCGArg, TCGArg,
+                 TCGArg, TCGArg);
+void tcg_gen_op6(TCGContext *, TCGOpcode, TCGArg, TCGArg, TCGArg,
+                 TCGArg, TCGArg, TCGArg);
 
-static inline void tcg_gen_op1_i32(TCGOpcode opc, TCGv_i32 arg1)
+
+static inline void tcg_gen_op1_i32(TCGOpcode opc, TCGv_i32 a1)
 {
-    *tcg_ctx.gen_opc_ptr++ = opc;
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
+    tcg_gen_op1(&tcg_ctx, opc, GET_TCGV_I32(a1));
 }
 
-static inline void tcg_gen_op1_i64(TCGOpcode opc, TCGv_i64 arg1)
+static inline void tcg_gen_op1_i64(TCGOpcode opc, TCGv_i64 a1)
 {
-    *tcg_ctx.gen_opc_ptr++ = opc;
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
+    tcg_gen_op1(&tcg_ctx, opc, GET_TCGV_I64(a1));
 }
 
-static inline void tcg_gen_op1i(TCGOpcode opc, TCGArg arg1)
+static inline void tcg_gen_op1i(TCGOpcode opc, TCGArg a1)
 {
-    *tcg_ctx.gen_opc_ptr++ = opc;
-    *tcg_ctx.gen_opparam_ptr++ = arg1;
+    tcg_gen_op1(&tcg_ctx, opc, a1);
 }
 
-static inline void tcg_gen_op2_i32(TCGOpcode opc, TCGv_i32 arg1, TCGv_i32 arg2)
+static inline void tcg_gen_op2_i32(TCGOpcode opc, TCGv_i32 a1, TCGv_i32 a2)
 {
-    *tcg_ctx.gen_opc_ptr++ = opc;
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg2);
+    tcg_gen_op2(&tcg_ctx, opc, GET_TCGV_I32(a1), GET_TCGV_I32(a2));
 }
 
-static inline void tcg_gen_op2_i64(TCGOpcode opc, TCGv_i64 arg1, TCGv_i64 arg2)
+static inline void tcg_gen_op2_i64(TCGOpcode opc, TCGv_i64 a1, TCGv_i64 a2)
 {
-    *tcg_ctx.gen_opc_ptr++ = opc;
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg2);
+    tcg_gen_op2(&tcg_ctx, opc, GET_TCGV_I64(a1), GET_TCGV_I64(a2));
 }
 
-static inline void tcg_gen_op2i_i32(TCGOpcode opc, TCGv_i32 arg1, TCGArg arg2)
+static inline void tcg_gen_op2i_i32(TCGOpcode opc, TCGv_i32 a1, TCGArg a2)
 {
-    *tcg_ctx.gen_opc_ptr++ = opc;
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
-    *tcg_ctx.gen_opparam_ptr++ = arg2;
+    tcg_gen_op2(&tcg_ctx, opc, GET_TCGV_I32(a1), a2);
 }
 
-static inline void tcg_gen_op2i_i64(TCGOpcode opc, TCGv_i64 arg1, TCGArg arg2)
+static inline void tcg_gen_op2i_i64(TCGOpcode opc, TCGv_i64 a1, TCGArg a2)
 {
-    *tcg_ctx.gen_opc_ptr++ = opc;
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
-    *tcg_ctx.gen_opparam_ptr++ = arg2;
+    tcg_gen_op2(&tcg_ctx, opc, GET_TCGV_I64(a1), a2);
 }
 
-static inline void tcg_gen_op2ii(TCGOpcode opc, TCGArg arg1, TCGArg arg2)
+static inline void tcg_gen_op2ii(TCGOpcode opc, TCGArg a1, TCGArg a2)
 {
-    *tcg_ctx.gen_opc_ptr++ = opc;
-    *tcg_ctx.gen_opparam_ptr++ = arg1;
-    *tcg_ctx.gen_opparam_ptr++ = arg2;
+    tcg_gen_op2(&tcg_ctx, opc, a1, a2);
 }
 
-static inline void tcg_gen_op3_i32(TCGOpcode opc, TCGv_i32 arg1, TCGv_i32 arg2,
-                                   TCGv_i32 arg3)
+static inline void tcg_gen_op3_i32(TCGOpcode opc, TCGv_i32 a1,
+                                   TCGv_i32 a2, TCGv_i32 a3)
 {
-    *tcg_ctx.gen_opc_ptr++ = opc;
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg2);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg3);
+    tcg_gen_op3(&tcg_ctx, opc, GET_TCGV_I32(a1),
+                GET_TCGV_I32(a2), GET_TCGV_I32(a3));
 }
 
-static inline void tcg_gen_op3_i64(TCGOpcode opc, TCGv_i64 arg1, TCGv_i64 arg2,
-                                   TCGv_i64 arg3)
+static inline void tcg_gen_op3_i64(TCGOpcode opc, TCGv_i64 a1,
+                                   TCGv_i64 a2, TCGv_i64 a3)
 {
-    *tcg_ctx.gen_opc_ptr++ = opc;
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg2);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg3);
+    tcg_gen_op3(&tcg_ctx, opc, GET_TCGV_I64(a1),
+                GET_TCGV_I64(a2), GET_TCGV_I64(a3));
 }
 
-static inline void tcg_gen_op3i_i32(TCGOpcode opc, TCGv_i32 arg1,
-                                    TCGv_i32 arg2, TCGArg arg3)
+static inline void tcg_gen_op3i_i32(TCGOpcode opc, TCGv_i32 a1,
+                                    TCGv_i32 a2, TCGArg a3)
 {
-    *tcg_ctx.gen_opc_ptr++ = opc;
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg2);
-    *tcg_ctx.gen_opparam_ptr++ = arg3;
+    tcg_gen_op3(&tcg_ctx, opc, GET_TCGV_I32(a1), GET_TCGV_I32(a2), a3);
 }
 
-static inline void tcg_gen_op3i_i64(TCGOpcode opc, TCGv_i64 arg1,
-                                    TCGv_i64 arg2, TCGArg arg3)
+static inline void tcg_gen_op3i_i64(TCGOpcode opc, TCGv_i64 a1,
+                                    TCGv_i64 a2, TCGArg a3)
 {
-    *tcg_ctx.gen_opc_ptr++ = opc;
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg2);
-    *tcg_ctx.gen_opparam_ptr++ = arg3;
+    tcg_gen_op3(&tcg_ctx, opc, GET_TCGV_I64(a1), GET_TCGV_I64(a2), a3);
 }
 
 static inline void tcg_gen_ldst_op_i32(TCGOpcode opc, TCGv_i32 val,
                                        TCGv_ptr base, TCGArg offset)
 {
-    *tcg_ctx.gen_opc_ptr++ = opc;
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(val);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_PTR(base);
-    *tcg_ctx.gen_opparam_ptr++ = offset;
+    tcg_gen_op3(&tcg_ctx, opc, GET_TCGV_I32(val), GET_TCGV_PTR(base), offset);
 }
 
 static inline void tcg_gen_ldst_op_i64(TCGOpcode opc, TCGv_i64 val,
                                        TCGv_ptr base, TCGArg offset)
 {
-    *tcg_ctx.gen_opc_ptr++ = opc;
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(val);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_PTR(base);
-    *tcg_ctx.gen_opparam_ptr++ = offset;
+    tcg_gen_op3(&tcg_ctx, opc, GET_TCGV_I64(val), GET_TCGV_PTR(base), offset);
 }
 
-static inline void tcg_gen_op4_i32(TCGOpcode opc, TCGv_i32 arg1, TCGv_i32 arg2,
-                                   TCGv_i32 arg3, TCGv_i32 arg4)
+static inline void tcg_gen_op4_i32(TCGOpcode opc, TCGv_i32 a1, TCGv_i32 a2,
+                                   TCGv_i32 a3, TCGv_i32 a4)
 {
-    *tcg_ctx.gen_opc_ptr++ = opc;
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg2);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg3);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg4);
+    tcg_gen_op4(&tcg_ctx, opc, GET_TCGV_I32(a1), GET_TCGV_I32(a2),
+                GET_TCGV_I32(a3), GET_TCGV_I32(a4));
 }
 
-static inline void tcg_gen_op4_i64(TCGOpcode opc, TCGv_i64 arg1, TCGv_i64 arg2,
-                                   TCGv_i64 arg3, TCGv_i64 arg4)
+static inline void tcg_gen_op4_i64(TCGOpcode opc, TCGv_i64 a1, TCGv_i64 a2,
+                                   TCGv_i64 a3, TCGv_i64 a4)
 {
-    *tcg_ctx.gen_opc_ptr++ = opc;
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg2);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg3);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg4);
+    tcg_gen_op4(&tcg_ctx, opc, GET_TCGV_I64(a1), GET_TCGV_I64(a2),
+                GET_TCGV_I64(a3), GET_TCGV_I64(a4));
 }
 
-static inline void tcg_gen_op4i_i32(TCGOpcode opc, TCGv_i32 arg1, TCGv_i32 arg2,
-                                    TCGv_i32 arg3, TCGArg arg4)
+static inline void tcg_gen_op4i_i32(TCGOpcode opc, TCGv_i32 a1, TCGv_i32 a2,
+                                    TCGv_i32 a3, TCGArg a4)
 {
-    *tcg_ctx.gen_opc_ptr++ = opc;
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg2);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg3);
-    *tcg_ctx.gen_opparam_ptr++ = arg4;
+    tcg_gen_op4(&tcg_ctx, opc, GET_TCGV_I32(a1), GET_TCGV_I32(a2),
+                GET_TCGV_I32(a3), a4);
 }
 
-static inline void tcg_gen_op4i_i64(TCGOpcode opc, TCGv_i64 arg1, TCGv_i64 arg2,
-                                    TCGv_i64 arg3, TCGArg arg4)
+static inline void tcg_gen_op4i_i64(TCGOpcode opc, TCGv_i64 a1, TCGv_i64 a2,
+                                    TCGv_i64 a3, TCGArg a4)
 {
-    *tcg_ctx.gen_opc_ptr++ = opc;
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg2);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg3);
-    *tcg_ctx.gen_opparam_ptr++ = arg4;
+    tcg_gen_op4(&tcg_ctx, opc, GET_TCGV_I64(a1), GET_TCGV_I64(a2),
+                GET_TCGV_I64(a3), a4);
 }
 
-static inline void tcg_gen_op4ii_i32(TCGOpcode opc, TCGv_i32 arg1, TCGv_i32 arg2,
-                                     TCGArg arg3, TCGArg arg4)
+static inline void tcg_gen_op4ii_i32(TCGOpcode opc, TCGv_i32 a1, TCGv_i32 a2,
+                                     TCGArg a3, TCGArg a4)
 {
-    *tcg_ctx.gen_opc_ptr++ = opc;
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg2);
-    *tcg_ctx.gen_opparam_ptr++ = arg3;
-    *tcg_ctx.gen_opparam_ptr++ = arg4;
+    tcg_gen_op4(&tcg_ctx, opc, GET_TCGV_I32(a1), GET_TCGV_I32(a2), a3, a4);
 }
 
-static inline void tcg_gen_op4ii_i64(TCGOpcode opc, TCGv_i64 arg1, TCGv_i64 arg2,
-                                     TCGArg arg3, TCGArg arg4)
+static inline void tcg_gen_op4ii_i64(TCGOpcode opc, TCGv_i64 a1, TCGv_i64 a2,
+                                     TCGArg a3, TCGArg a4)
 {
-    *tcg_ctx.gen_opc_ptr++ = opc;
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg2);
-    *tcg_ctx.gen_opparam_ptr++ = arg3;
-    *tcg_ctx.gen_opparam_ptr++ = arg4;
+    tcg_gen_op4(&tcg_ctx, opc, GET_TCGV_I64(a1), GET_TCGV_I64(a2), a3, a4);
 }
 
-static inline void tcg_gen_op5_i32(TCGOpcode opc, TCGv_i32 arg1, TCGv_i32 arg2,
-                                   TCGv_i32 arg3, TCGv_i32 arg4, TCGv_i32 arg5)
+static inline void tcg_gen_op5_i32(TCGOpcode opc, TCGv_i32 a1, TCGv_i32 a2,
+                                   TCGv_i32 a3, TCGv_i32 a4, TCGv_i32 a5)
 {
-    *tcg_ctx.gen_opc_ptr++ = opc;
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg2);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg3);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg4);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg5);
+    tcg_gen_op5(&tcg_ctx, opc, GET_TCGV_I32(a1), GET_TCGV_I32(a2),
+                GET_TCGV_I32(a3), GET_TCGV_I32(a4), GET_TCGV_I32(a5));
 }
 
-static inline void tcg_gen_op5_i64(TCGOpcode opc, TCGv_i64 arg1, TCGv_i64 arg2,
-                                   TCGv_i64 arg3, TCGv_i64 arg4, TCGv_i64 arg5)
+static inline void tcg_gen_op5_i64(TCGOpcode opc, TCGv_i64 a1, TCGv_i64 a2,
+                                   TCGv_i64 a3, TCGv_i64 a4, TCGv_i64 a5)
 {
-    *tcg_ctx.gen_opc_ptr++ = opc;
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg2);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg3);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg4);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg5);
+    tcg_gen_op5(&tcg_ctx, opc, GET_TCGV_I64(a1), GET_TCGV_I64(a2),
+                GET_TCGV_I64(a3), GET_TCGV_I64(a4), GET_TCGV_I64(a5));
 }
 
-static inline void tcg_gen_op5i_i32(TCGOpcode opc, TCGv_i32 arg1, TCGv_i32 arg2,
-                                    TCGv_i32 arg3, TCGv_i32 arg4, TCGArg arg5)
+static inline void tcg_gen_op5i_i32(TCGOpcode opc, TCGv_i32 a1, TCGv_i32 a2,
+                                    TCGv_i32 a3, TCGv_i32 a4, TCGArg a5)
 {
-    *tcg_ctx.gen_opc_ptr++ = opc;
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg2);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg3);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg4);
-    *tcg_ctx.gen_opparam_ptr++ = arg5;
+    tcg_gen_op5(&tcg_ctx, opc, GET_TCGV_I32(a1), GET_TCGV_I32(a2),
+                GET_TCGV_I32(a3), GET_TCGV_I32(a4), a5);
 }
 
-static inline void tcg_gen_op5i_i64(TCGOpcode opc, TCGv_i64 arg1, TCGv_i64 arg2,
-                                    TCGv_i64 arg3, TCGv_i64 arg4, TCGArg arg5)
+static inline void tcg_gen_op5i_i64(TCGOpcode opc, TCGv_i64 a1, TCGv_i64 a2,
+                                    TCGv_i64 a3, TCGv_i64 a4, TCGArg a5)
 {
-    *tcg_ctx.gen_opc_ptr++ = opc;
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg2);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg3);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg4);
-    *tcg_ctx.gen_opparam_ptr++ = arg5;
+    tcg_gen_op5(&tcg_ctx, opc, GET_TCGV_I64(a1), GET_TCGV_I64(a2),
+                GET_TCGV_I64(a3), GET_TCGV_I64(a4), a5);
 }
 
-static inline void tcg_gen_op5ii_i32(TCGOpcode opc, TCGv_i32 arg1,
-                                     TCGv_i32 arg2, TCGv_i32 arg3,
-                                     TCGArg arg4, TCGArg arg5)
+static inline void tcg_gen_op5ii_i32(TCGOpcode opc, TCGv_i32 a1, TCGv_i32 a2,
+                                     TCGv_i32 a3, TCGArg a4, TCGArg a5)
 {
-    *tcg_ctx.gen_opc_ptr++ = opc;
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg2);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg3);
-    *tcg_ctx.gen_opparam_ptr++ = arg4;
-    *tcg_ctx.gen_opparam_ptr++ = arg5;
+    tcg_gen_op5(&tcg_ctx, opc, GET_TCGV_I32(a1), GET_TCGV_I32(a2),
+                GET_TCGV_I32(a3), a4, a5);
 }
 
-static inline void tcg_gen_op5ii_i64(TCGOpcode opc, TCGv_i64 arg1,
-                                     TCGv_i64 arg2, TCGv_i64 arg3,
-                                     TCGArg arg4, TCGArg arg5)
+static inline void tcg_gen_op5ii_i64(TCGOpcode opc, TCGv_i64 a1, TCGv_i64 a2,
+                                     TCGv_i64 a3, TCGArg a4, TCGArg a5)
 {
-    *tcg_ctx.gen_opc_ptr++ = opc;
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg2);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg3);
-    *tcg_ctx.gen_opparam_ptr++ = arg4;
-    *tcg_ctx.gen_opparam_ptr++ = arg5;
+    tcg_gen_op5(&tcg_ctx, opc, GET_TCGV_I64(a1), GET_TCGV_I64(a2),
+                GET_TCGV_I64(a3), a4, a5);
 }
 
-static inline void tcg_gen_op6_i32(TCGOpcode opc, TCGv_i32 arg1, TCGv_i32 arg2,
-                                   TCGv_i32 arg3, TCGv_i32 arg4, TCGv_i32 arg5,
-                                   TCGv_i32 arg6)
+static inline void tcg_gen_op6_i32(TCGOpcode opc, TCGv_i32 a1, TCGv_i32 a2,
+                                   TCGv_i32 a3, TCGv_i32 a4,
+                                   TCGv_i32 a5, TCGv_i32 a6)
 {
-    *tcg_ctx.gen_opc_ptr++ = opc;
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg2);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg3);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg4);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg5);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg6);
+    tcg_gen_op6(&tcg_ctx, opc, GET_TCGV_I32(a1), GET_TCGV_I32(a2),
+                GET_TCGV_I32(a3), GET_TCGV_I32(a4), GET_TCGV_I32(a5),
+                GET_TCGV_I32(a6));
 }
 
-static inline void tcg_gen_op6_i64(TCGOpcode opc, TCGv_i64 arg1, TCGv_i64 arg2,
-                                   TCGv_i64 arg3, TCGv_i64 arg4, TCGv_i64 arg5,
-                                   TCGv_i64 arg6)
+static inline void tcg_gen_op6_i64(TCGOpcode opc, TCGv_i64 a1, TCGv_i64 a2,
+                                   TCGv_i64 a3, TCGv_i64 a4,
+                                   TCGv_i64 a5, TCGv_i64 a6)
 {
-    *tcg_ctx.gen_opc_ptr++ = opc;
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg2);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg3);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg4);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg5);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg6);
+    tcg_gen_op6(&tcg_ctx, opc, GET_TCGV_I64(a1), GET_TCGV_I64(a2),
+                GET_TCGV_I64(a3), GET_TCGV_I64(a4), GET_TCGV_I64(a5),
+                GET_TCGV_I64(a6));
 }
 
-static inline void tcg_gen_op6i_i32(TCGOpcode opc, TCGv_i32 arg1, TCGv_i32 arg2,
-                                    TCGv_i32 arg3, TCGv_i32 arg4,
-                                    TCGv_i32 arg5, TCGArg arg6)
+static inline void tcg_gen_op6i_i32(TCGOpcode opc, TCGv_i32 a1, TCGv_i32 a2,
+                                    TCGv_i32 a3, TCGv_i32 a4,
+                                    TCGv_i32 a5, TCGArg a6)
 {
-    *tcg_ctx.gen_opc_ptr++ = opc;
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg2);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg3);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg4);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg5);
-    *tcg_ctx.gen_opparam_ptr++ = arg6;
+    tcg_gen_op6(&tcg_ctx, opc, GET_TCGV_I32(a1), GET_TCGV_I32(a2),
+                GET_TCGV_I32(a3), GET_TCGV_I32(a4), GET_TCGV_I32(a5), a6);
 }
 
-static inline void tcg_gen_op6i_i64(TCGOpcode opc, TCGv_i64 arg1, TCGv_i64 arg2,
-                                    TCGv_i64 arg3, TCGv_i64 arg4,
-                                    TCGv_i64 arg5, TCGArg arg6)
+static inline void tcg_gen_op6i_i64(TCGOpcode opc, TCGv_i64 a1, TCGv_i64 a2,
+                                    TCGv_i64 a3, TCGv_i64 a4,
+                                    TCGv_i64 a5, TCGArg a6)
 {
-    *tcg_ctx.gen_opc_ptr++ = opc;
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg2);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg3);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg4);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg5);
-    *tcg_ctx.gen_opparam_ptr++ = arg6;
+    tcg_gen_op6(&tcg_ctx, opc, GET_TCGV_I64(a1), GET_TCGV_I64(a2),
+                GET_TCGV_I64(a3), GET_TCGV_I64(a4), GET_TCGV_I64(a5), a6);
 }
 
-static inline void tcg_gen_op6ii_i32(TCGOpcode opc, TCGv_i32 arg1,
-                                     TCGv_i32 arg2, TCGv_i32 arg3,
-                                     TCGv_i32 arg4, TCGArg arg5, TCGArg arg6)
+static inline void tcg_gen_op6ii_i32(TCGOpcode opc, TCGv_i32 a1, TCGv_i32 a2,
+                                     TCGv_i32 a3, TCGv_i32 a4,
+                                     TCGArg a5, TCGArg a6)
 {
-    *tcg_ctx.gen_opc_ptr++ = opc;
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg1);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg2);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg3);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(arg4);
-    *tcg_ctx.gen_opparam_ptr++ = arg5;
-    *tcg_ctx.gen_opparam_ptr++ = arg6;
+    tcg_gen_op6(&tcg_ctx, opc, GET_TCGV_I32(a1), GET_TCGV_I32(a2),
+                GET_TCGV_I32(a3), GET_TCGV_I32(a4), a5, a6);
 }
 
-static inline void tcg_gen_op6ii_i64(TCGOpcode opc, TCGv_i64 arg1,
-                                     TCGv_i64 arg2, TCGv_i64 arg3,
-                                     TCGv_i64 arg4, TCGArg arg5, TCGArg arg6)
+static inline void tcg_gen_op6ii_i64(TCGOpcode opc, TCGv_i64 a1, TCGv_i64 a2,
+                                     TCGv_i64 a3, TCGv_i64 a4,
+                                     TCGArg a5, TCGArg a6)
 {
-    *tcg_ctx.gen_opc_ptr++ = opc;
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg1);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg2);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg3);
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(arg4);
-    *tcg_ctx.gen_opparam_ptr++ = arg5;
-    *tcg_ctx.gen_opparam_ptr++ = arg6;
+    tcg_gen_op6(&tcg_ctx, opc, GET_TCGV_I64(a1), GET_TCGV_I64(a2),
+                GET_TCGV_I64(a3), GET_TCGV_I64(a4), a5, a6);
 }
 
-static inline void tcg_add_param_i32(TCGv_i32 val)
-{
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(val);
-}
 
-static inline void tcg_add_param_i64(TCGv_i64 val)
-{
-#if TCG_TARGET_REG_BITS == 32
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(TCGV_LOW(val));
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(TCGV_HIGH(val));
-#else
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(val);
-#endif
-}
+/* Generic ops.  */
+
+int gen_new_label(void);
 
 static inline void gen_set_label(int n)
 {
-    tcg_gen_op1i(INDEX_op_set_label, n);
+    tcg_gen_op1(&tcg_ctx, INDEX_op_set_label, n);
 }
 
 static inline void tcg_gen_br(int label)
 {
-    tcg_gen_op1i(INDEX_op_br, label);
+    tcg_gen_op1(&tcg_ctx, INDEX_op_br, label);
+}
+
+
+/* Helper calls. */
+
+/* 32 bit ops */
+
+void tcg_gen_addi_i32(TCGv_i32 ret, TCGv_i32 arg1, int32_t arg2);
+void tcg_gen_subfi_i32(TCGv_i32 ret, int32_t arg1, TCGv_i32 arg2);
+void tcg_gen_subi_i32(TCGv_i32 ret, TCGv_i32 arg1, int32_t arg2);
+void tcg_gen_andi_i32(TCGv_i32 ret, TCGv_i32 arg1, uint32_t arg2);
+void tcg_gen_ori_i32(TCGv_i32 ret, TCGv_i32 arg1, int32_t arg2);
+void tcg_gen_xori_i32(TCGv_i32 ret, TCGv_i32 arg1, int32_t arg2);
+void tcg_gen_shli_i32(TCGv_i32 ret, TCGv_i32 arg1, unsigned arg2);
+void tcg_gen_shri_i32(TCGv_i32 ret, TCGv_i32 arg1, unsigned arg2);
+void tcg_gen_sari_i32(TCGv_i32 ret, TCGv_i32 arg1, unsigned arg2);
+void tcg_gen_muli_i32(TCGv_i32 ret, TCGv_i32 arg1, int32_t arg2);
+void tcg_gen_div_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2);
+void tcg_gen_rem_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2);
+void tcg_gen_divu_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2);
+void tcg_gen_remu_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2);
+void tcg_gen_andc_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2);
+void tcg_gen_eqv_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2);
+void tcg_gen_nand_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2);
+void tcg_gen_nor_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2);
+void tcg_gen_orc_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2);
+void tcg_gen_rotl_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2);
+void tcg_gen_rotli_i32(TCGv_i32 ret, TCGv_i32 arg1, unsigned arg2);
+void tcg_gen_rotr_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2);
+void tcg_gen_rotri_i32(TCGv_i32 ret, TCGv_i32 arg1, unsigned arg2);
+void tcg_gen_deposit_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2,
+                         unsigned int ofs, unsigned int len);
+void tcg_gen_brcond_i32(TCGCond cond, TCGv_i32 arg1, TCGv_i32 arg2, int label);
+void tcg_gen_brcondi_i32(TCGCond cond, TCGv_i32 arg1, int32_t arg2, int label);
+void tcg_gen_setcond_i32(TCGCond cond, TCGv_i32 ret,
+                         TCGv_i32 arg1, TCGv_i32 arg2);
+void tcg_gen_setcondi_i32(TCGCond cond, TCGv_i32 ret,
+                          TCGv_i32 arg1, int32_t arg2);
+void tcg_gen_movcond_i32(TCGCond cond, TCGv_i32 ret, TCGv_i32 c1,
+                         TCGv_i32 c2, TCGv_i32 v1, TCGv_i32 v2);
+void tcg_gen_add2_i32(TCGv_i32 rl, TCGv_i32 rh, TCGv_i32 al,
+                      TCGv_i32 ah, TCGv_i32 bl, TCGv_i32 bh);
+void tcg_gen_sub2_i32(TCGv_i32 rl, TCGv_i32 rh, TCGv_i32 al,
+                      TCGv_i32 ah, TCGv_i32 bl, TCGv_i32 bh);
+void tcg_gen_mulu2_i32(TCGv_i32 rl, TCGv_i32 rh, TCGv_i32 arg1, TCGv_i32 arg2);
+void tcg_gen_muls2_i32(TCGv_i32 rl, TCGv_i32 rh, TCGv_i32 arg1, TCGv_i32 arg2);
+void tcg_gen_ext8s_i32(TCGv_i32 ret, TCGv_i32 arg);
+void tcg_gen_ext16s_i32(TCGv_i32 ret, TCGv_i32 arg);
+void tcg_gen_ext8u_i32(TCGv_i32 ret, TCGv_i32 arg);
+void tcg_gen_ext16u_i32(TCGv_i32 ret, TCGv_i32 arg);
+void tcg_gen_bswap16_i32(TCGv_i32 ret, TCGv_i32 arg);
+void tcg_gen_bswap32_i32(TCGv_i32 ret, TCGv_i32 arg);
+
+static inline void tcg_gen_discard_i32(TCGv_i32 arg)
+{
+    tcg_gen_op1_i32(INDEX_op_discard, arg);
 }
 
 static inline void tcg_gen_mov_i32(TCGv_i32 ret, TCGv_i32 arg)
 {
-    if (!TCGV_EQUAL_I32(ret, arg))
+    if (!TCGV_EQUAL_I32(ret, arg)) {
         tcg_gen_op2_i32(INDEX_op_mov_i32, ret, arg);
+    }
 }
 
 static inline void tcg_gen_movi_i32(TCGv_i32 ret, int32_t arg)
@@ -381,44 +332,50 @@ static inline void tcg_gen_movi_i32(TCGv_i32 ret, int32_t arg)
     tcg_gen_op2i_i32(INDEX_op_movi_i32, ret, arg);
 }
 
-/* 32 bit ops */
-
-static inline void tcg_gen_ld8u_i32(TCGv_i32 ret, TCGv_ptr arg2, tcg_target_long offset)
+static inline void tcg_gen_ld8u_i32(TCGv_i32 ret, TCGv_ptr arg2,
+                                    tcg_target_long offset)
 {
     tcg_gen_ldst_op_i32(INDEX_op_ld8u_i32, ret, arg2, offset);
 }
 
-static inline void tcg_gen_ld8s_i32(TCGv_i32 ret, TCGv_ptr arg2, tcg_target_long offset)
+static inline void tcg_gen_ld8s_i32(TCGv_i32 ret, TCGv_ptr arg2,
+                                    tcg_target_long offset)
 {
     tcg_gen_ldst_op_i32(INDEX_op_ld8s_i32, ret, arg2, offset);
 }
 
-static inline void tcg_gen_ld16u_i32(TCGv_i32 ret, TCGv_ptr arg2, tcg_target_long offset)
+static inline void tcg_gen_ld16u_i32(TCGv_i32 ret, TCGv_ptr arg2,
+                                     tcg_target_long offset)
 {
     tcg_gen_ldst_op_i32(INDEX_op_ld16u_i32, ret, arg2, offset);
 }
 
-static inline void tcg_gen_ld16s_i32(TCGv_i32 ret, TCGv_ptr arg2, tcg_target_long offset)
+static inline void tcg_gen_ld16s_i32(TCGv_i32 ret, TCGv_ptr arg2,
+                                     tcg_target_long offset)
 {
     tcg_gen_ldst_op_i32(INDEX_op_ld16s_i32, ret, arg2, offset);
 }
 
-static inline void tcg_gen_ld_i32(TCGv_i32 ret, TCGv_ptr arg2, tcg_target_long offset)
+static inline void tcg_gen_ld_i32(TCGv_i32 ret, TCGv_ptr arg2,
+                                  tcg_target_long offset)
 {
     tcg_gen_ldst_op_i32(INDEX_op_ld_i32, ret, arg2, offset);
 }
 
-static inline void tcg_gen_st8_i32(TCGv_i32 arg1, TCGv_ptr arg2, tcg_target_long offset)
+static inline void tcg_gen_st8_i32(TCGv_i32 arg1, TCGv_ptr arg2,
+                                   tcg_target_long offset)
 {
     tcg_gen_ldst_op_i32(INDEX_op_st8_i32, arg1, arg2, offset);
 }
 
-static inline void tcg_gen_st16_i32(TCGv_i32 arg1, TCGv_ptr arg2, tcg_target_long offset)
+static inline void tcg_gen_st16_i32(TCGv_i32 arg1, TCGv_ptr arg2,
+                                    tcg_target_long offset)
 {
     tcg_gen_ldst_op_i32(INDEX_op_st16_i32, arg1, arg2, offset);
 }
 
-static inline void tcg_gen_st_i32(TCGv_i32 arg1, TCGv_ptr arg2, tcg_target_long offset)
+static inline void tcg_gen_st_i32(TCGv_i32 arg1, TCGv_ptr arg2,
+                                  tcg_target_long offset)
 {
     tcg_gen_ldst_op_i32(INDEX_op_st_i32, arg1, arg2, offset);
 }
@@ -428,126 +385,24 @@ static inline void tcg_gen_add_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
     tcg_gen_op3_i32(INDEX_op_add_i32, ret, arg1, arg2);
 }
 
-static inline void tcg_gen_addi_i32(TCGv_i32 ret, TCGv_i32 arg1, int32_t arg2)
-{
-    /* some cases can be optimized here */
-    if (arg2 == 0) {
-        tcg_gen_mov_i32(ret, arg1);
-    } else {
-        TCGv_i32 t0 = tcg_const_i32(arg2);
-        tcg_gen_add_i32(ret, arg1, t0);
-        tcg_temp_free_i32(t0);
-    }
-}
-
 static inline void tcg_gen_sub_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
 {
     tcg_gen_op3_i32(INDEX_op_sub_i32, ret, arg1, arg2);
 }
 
-static inline void tcg_gen_subfi_i32(TCGv_i32 ret, int32_t arg1, TCGv_i32 arg2)
-{
-    TCGv_i32 t0 = tcg_const_i32(arg1);
-    tcg_gen_sub_i32(ret, t0, arg2);
-    tcg_temp_free_i32(t0);
-}
-
-static inline void tcg_gen_subi_i32(TCGv_i32 ret, TCGv_i32 arg1, int32_t arg2)
-{
-    /* some cases can be optimized here */
-    if (arg2 == 0) {
-        tcg_gen_mov_i32(ret, arg1);
-    } else {
-        TCGv_i32 t0 = tcg_const_i32(arg2);
-        tcg_gen_sub_i32(ret, arg1, t0);
-        tcg_temp_free_i32(t0);
-    }
-}
-
 static inline void tcg_gen_and_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
 {
-    if (TCGV_EQUAL_I32(arg1, arg2)) {
-        tcg_gen_mov_i32(ret, arg1);
-    } else {
-        tcg_gen_op3_i32(INDEX_op_and_i32, ret, arg1, arg2);
-    }
-}
-
-static inline void tcg_gen_andi_i32(TCGv_i32 ret, TCGv_i32 arg1, uint32_t arg2)
-{
-    TCGv_i32 t0;
-    /* Some cases can be optimized here.  */
-    switch (arg2) {
-    case 0:
-        tcg_gen_movi_i32(ret, 0);
-        return;
-    case 0xffffffffu:
-        tcg_gen_mov_i32(ret, arg1);
-        return;
-    case 0xffu:
-        /* Don't recurse with tcg_gen_ext8u_i32.  */
-        if (TCG_TARGET_HAS_ext8u_i32) {
-            tcg_gen_op2_i32(INDEX_op_ext8u_i32, ret, arg1);
-            return;
-        }
-        break;
-    case 0xffffu:
-        if (TCG_TARGET_HAS_ext16u_i32) {
-            tcg_gen_op2_i32(INDEX_op_ext16u_i32, ret, arg1);
-            return;
-        }
-        break;
-    }
-    t0 = tcg_const_i32(arg2);
-    tcg_gen_and_i32(ret, arg1, t0);
-    tcg_temp_free_i32(t0);
+    tcg_gen_op3_i32(INDEX_op_and_i32, ret, arg1, arg2);
 }
 
 static inline void tcg_gen_or_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
 {
-    if (TCGV_EQUAL_I32(arg1, arg2)) {
-        tcg_gen_mov_i32(ret, arg1);
-    } else {
-        tcg_gen_op3_i32(INDEX_op_or_i32, ret, arg1, arg2);
-    }
-}
-
-static inline void tcg_gen_ori_i32(TCGv_i32 ret, TCGv_i32 arg1, int32_t arg2)
-{
-    /* Some cases can be optimized here.  */
-    if (arg2 == -1) {
-        tcg_gen_movi_i32(ret, -1);
-    } else if (arg2 == 0) {
-        tcg_gen_mov_i32(ret, arg1);
-    } else {
-        TCGv_i32 t0 = tcg_const_i32(arg2);
-        tcg_gen_or_i32(ret, arg1, t0);
-        tcg_temp_free_i32(t0);
-    }
+    tcg_gen_op3_i32(INDEX_op_or_i32, ret, arg1, arg2);
 }
 
 static inline void tcg_gen_xor_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
 {
-    if (TCGV_EQUAL_I32(arg1, arg2)) {
-        tcg_gen_movi_i32(ret, 0);
-    } else {
-        tcg_gen_op3_i32(INDEX_op_xor_i32, ret, arg1, arg2);
-    }
-}
-
-static inline void tcg_gen_xori_i32(TCGv_i32 ret, TCGv_i32 arg1, int32_t arg2)
-{
-    /* Some cases can be optimized here.  */
-    if (arg2 == 0) {
-        tcg_gen_mov_i32(ret, arg1);
-    } else if (arg2 == -1 && TCG_TARGET_HAS_not_i32) {
-        /* Don't recurse with tcg_gen_not_i32.  */
-        tcg_gen_op2_i32(INDEX_op_not_i32, ret, arg1);
-    } else {
-        TCGv_i32 t0 = tcg_const_i32(arg2);
-        tcg_gen_xor_i32(ret, arg1, t0);
-        tcg_temp_free_i32(t0);
-    }
+    tcg_gen_op3_i32(INDEX_op_xor_i32, ret, arg1, arg2);
 }
 
 static inline void tcg_gen_shl_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
@@ -555,1913 +410,322 @@ static inline void tcg_gen_shl_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
     tcg_gen_op3_i32(INDEX_op_shl_i32, ret, arg1, arg2);
 }
 
-static inline void tcg_gen_shli_i32(TCGv_i32 ret, TCGv_i32 arg1, int32_t arg2)
-{
-    if (arg2 == 0) {
-        tcg_gen_mov_i32(ret, arg1);
-    } else {
-        TCGv_i32 t0 = tcg_const_i32(arg2);
-        tcg_gen_shl_i32(ret, arg1, t0);
-        tcg_temp_free_i32(t0);
-    }
-}
-
 static inline void tcg_gen_shr_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
 {
     tcg_gen_op3_i32(INDEX_op_shr_i32, ret, arg1, arg2);
 }
 
-static inline void tcg_gen_shri_i32(TCGv_i32 ret, TCGv_i32 arg1, int32_t arg2)
-{
-    if (arg2 == 0) {
-        tcg_gen_mov_i32(ret, arg1);
-    } else {
-        TCGv_i32 t0 = tcg_const_i32(arg2);
-        tcg_gen_shr_i32(ret, arg1, t0);
-        tcg_temp_free_i32(t0);
-    }
-}
-
 static inline void tcg_gen_sar_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
 {
     tcg_gen_op3_i32(INDEX_op_sar_i32, ret, arg1, arg2);
 }
 
-static inline void tcg_gen_sari_i32(TCGv_i32 ret, TCGv_i32 arg1, int32_t arg2)
-{
-    if (arg2 == 0) {
-        tcg_gen_mov_i32(ret, arg1);
-    } else {
-        TCGv_i32 t0 = tcg_const_i32(arg2);
-        tcg_gen_sar_i32(ret, arg1, t0);
-        tcg_temp_free_i32(t0);
-    }
-}
-
-static inline void tcg_gen_brcond_i32(TCGCond cond, TCGv_i32 arg1,
-                                      TCGv_i32 arg2, int label_index)
-{
-    if (cond == TCG_COND_ALWAYS) {
-        tcg_gen_br(label_index);
-    } else if (cond != TCG_COND_NEVER) {
-        tcg_gen_op4ii_i32(INDEX_op_brcond_i32, arg1, arg2, cond, label_index);
-    }
-}
-
-static inline void tcg_gen_brcondi_i32(TCGCond cond, TCGv_i32 arg1,
-                                       int32_t arg2, int label_index)
-{
-    if (cond == TCG_COND_ALWAYS) {
-        tcg_gen_br(label_index);
-    } else if (cond != TCG_COND_NEVER) {
-        TCGv_i32 t0 = tcg_const_i32(arg2);
-        tcg_gen_brcond_i32(cond, arg1, t0, label_index);
-        tcg_temp_free_i32(t0);
-    }
-}
-
-static inline void tcg_gen_setcond_i32(TCGCond cond, TCGv_i32 ret,
-                                       TCGv_i32 arg1, TCGv_i32 arg2)
-{
-    if (cond == TCG_COND_ALWAYS) {
-        tcg_gen_movi_i32(ret, 1);
-    } else if (cond == TCG_COND_NEVER) {
-        tcg_gen_movi_i32(ret, 0);
-    } else {
-        tcg_gen_op4i_i32(INDEX_op_setcond_i32, ret, arg1, arg2, cond);
-    }
-}
-
-static inline void tcg_gen_setcondi_i32(TCGCond cond, TCGv_i32 ret,
-                                        TCGv_i32 arg1, int32_t arg2)
-{
-    if (cond == TCG_COND_ALWAYS) {
-        tcg_gen_movi_i32(ret, 1);
-    } else if (cond == TCG_COND_NEVER) {
-        tcg_gen_movi_i32(ret, 0);
-    } else {
-        TCGv_i32 t0 = tcg_const_i32(arg2);
-        tcg_gen_setcond_i32(cond, ret, arg1, t0);
-        tcg_temp_free_i32(t0);
-    }
-}
-
 static inline void tcg_gen_mul_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
 {
     tcg_gen_op3_i32(INDEX_op_mul_i32, ret, arg1, arg2);
 }
 
-static inline void tcg_gen_muli_i32(TCGv_i32 ret, TCGv_i32 arg1, int32_t arg2)
-{
-    TCGv_i32 t0 = tcg_const_i32(arg2);
-    tcg_gen_mul_i32(ret, arg1, t0);
-    tcg_temp_free_i32(t0);
-}
-
-static inline void tcg_gen_div_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
+static inline void tcg_gen_neg_i32(TCGv_i32 ret, TCGv_i32 arg)
 {
-    if (TCG_TARGET_HAS_div_i32) {
-        tcg_gen_op3_i32(INDEX_op_div_i32, ret, arg1, arg2);
-    } else if (TCG_TARGET_HAS_div2_i32) {
-        TCGv_i32 t0 = tcg_temp_new_i32();
-        tcg_gen_sari_i32(t0, arg1, 31);
-        tcg_gen_op5_i32(INDEX_op_div2_i32, ret, t0, arg1, t0, arg2);
-        tcg_temp_free_i32(t0);
-    } else {
-        gen_helper_div_i32(ret, arg1, arg2);
-    }
-}
-
-static inline void tcg_gen_rem_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
-{
-    if (TCG_TARGET_HAS_rem_i32) {
-        tcg_gen_op3_i32(INDEX_op_rem_i32, ret, arg1, arg2);
-    } else if (TCG_TARGET_HAS_div_i32) {
-        TCGv_i32 t0 = tcg_temp_new_i32();
-        tcg_gen_op3_i32(INDEX_op_div_i32, t0, arg1, arg2);
-        tcg_gen_mul_i32(t0, t0, arg2);
-        tcg_gen_sub_i32(ret, arg1, t0);
-        tcg_temp_free_i32(t0);
-    } else if (TCG_TARGET_HAS_div2_i32) {
-        TCGv_i32 t0 = tcg_temp_new_i32();
-        tcg_gen_sari_i32(t0, arg1, 31);
-        tcg_gen_op5_i32(INDEX_op_div2_i32, t0, ret, arg1, t0, arg2);
-        tcg_temp_free_i32(t0);
+    if (TCG_TARGET_HAS_neg_i32) {
+        tcg_gen_op2_i32(INDEX_op_neg_i32, ret, arg);
     } else {
-        gen_helper_rem_i32(ret, arg1, arg2);
+        tcg_gen_subfi_i32(ret, 0, arg);
     }
 }
 
-static inline void tcg_gen_divu_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
+static inline void tcg_gen_not_i32(TCGv_i32 ret, TCGv_i32 arg)
 {
-    if (TCG_TARGET_HAS_div_i32) {
-        tcg_gen_op3_i32(INDEX_op_divu_i32, ret, arg1, arg2);
-    } else if (TCG_TARGET_HAS_div2_i32) {
-        TCGv_i32 t0 = tcg_temp_new_i32();
-        tcg_gen_movi_i32(t0, 0);
-        tcg_gen_op5_i32(INDEX_op_divu2_i32, ret, t0, arg1, t0, arg2);
-        tcg_temp_free_i32(t0);
+    if (TCG_TARGET_HAS_not_i32) {
+        tcg_gen_op2_i32(INDEX_op_not_i32, ret, arg);
     } else {
-        gen_helper_divu_i32(ret, arg1, arg2);
+        tcg_gen_xori_i32(ret, arg, -1);
     }
 }
 
-static inline void tcg_gen_remu_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
-{
-    if (TCG_TARGET_HAS_rem_i32) {
-        tcg_gen_op3_i32(INDEX_op_remu_i32, ret, arg1, arg2);
-    } else if (TCG_TARGET_HAS_div_i32) {
-        TCGv_i32 t0 = tcg_temp_new_i32();
-        tcg_gen_op3_i32(INDEX_op_divu_i32, t0, arg1, arg2);
-        tcg_gen_mul_i32(t0, t0, arg2);
-        tcg_gen_sub_i32(ret, arg1, t0);
-        tcg_temp_free_i32(t0);
-    } else if (TCG_TARGET_HAS_div2_i32) {
-        TCGv_i32 t0 = tcg_temp_new_i32();
-        tcg_gen_movi_i32(t0, 0);
-        tcg_gen_op5_i32(INDEX_op_divu2_i32, t0, ret, arg1, t0, arg2);
-        tcg_temp_free_i32(t0);
-    } else {
-        gen_helper_remu_i32(ret, arg1, arg2);
-    }
-}
+/* 64 bit ops */
+
+void tcg_gen_addi_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2);
+void tcg_gen_subfi_i64(TCGv_i64 ret, int64_t arg1, TCGv_i64 arg2);
+void tcg_gen_subi_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2);
+void tcg_gen_andi_i64(TCGv_i64 ret, TCGv_i64 arg1, uint64_t arg2);
+void tcg_gen_ori_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2);
+void tcg_gen_xori_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2);
+void tcg_gen_shli_i64(TCGv_i64 ret, TCGv_i64 arg1, unsigned arg2);
+void tcg_gen_shri_i64(TCGv_i64 ret, TCGv_i64 arg1, unsigned arg2);
+void tcg_gen_sari_i64(TCGv_i64 ret, TCGv_i64 arg1, unsigned arg2);
+void tcg_gen_muli_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2);
+void tcg_gen_div_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2);
+void tcg_gen_rem_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2);
+void tcg_gen_divu_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2);
+void tcg_gen_remu_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2);
+void tcg_gen_andc_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2);
+void tcg_gen_eqv_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2);
+void tcg_gen_nand_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2);
+void tcg_gen_nor_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2);
+void tcg_gen_orc_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2);
+void tcg_gen_rotl_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2);
+void tcg_gen_rotli_i64(TCGv_i64 ret, TCGv_i64 arg1, unsigned arg2);
+void tcg_gen_rotr_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2);
+void tcg_gen_rotri_i64(TCGv_i64 ret, TCGv_i64 arg1, unsigned arg2);
+void tcg_gen_deposit_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2,
+                         unsigned int ofs, unsigned int len);
+void tcg_gen_brcond_i64(TCGCond cond, TCGv_i64 arg1, TCGv_i64 arg2, int label);
+void tcg_gen_brcondi_i64(TCGCond cond, TCGv_i64 arg1, int64_t arg2, int label);
+void tcg_gen_setcond_i64(TCGCond cond, TCGv_i64 ret,
+                         TCGv_i64 arg1, TCGv_i64 arg2);
+void tcg_gen_setcondi_i64(TCGCond cond, TCGv_i64 ret,
+                          TCGv_i64 arg1, int64_t arg2);
+void tcg_gen_movcond_i64(TCGCond cond, TCGv_i64 ret, TCGv_i64 c1,
+                         TCGv_i64 c2, TCGv_i64 v1, TCGv_i64 v2);
+void tcg_gen_add2_i64(TCGv_i64 rl, TCGv_i64 rh, TCGv_i64 al,
+                      TCGv_i64 ah, TCGv_i64 bl, TCGv_i64 bh);
+void tcg_gen_sub2_i64(TCGv_i64 rl, TCGv_i64 rh, TCGv_i64 al,
+                      TCGv_i64 ah, TCGv_i64 bl, TCGv_i64 bh);
+void tcg_gen_mulu2_i64(TCGv_i64 rl, TCGv_i64 rh, TCGv_i64 arg1, TCGv_i64 arg2);
+void tcg_gen_muls2_i64(TCGv_i64 rl, TCGv_i64 rh, TCGv_i64 arg1, TCGv_i64 arg2);
+void tcg_gen_not_i64(TCGv_i64 ret, TCGv_i64 arg);
+void tcg_gen_ext8s_i64(TCGv_i64 ret, TCGv_i64 arg);
+void tcg_gen_ext16s_i64(TCGv_i64 ret, TCGv_i64 arg);
+void tcg_gen_ext32s_i64(TCGv_i64 ret, TCGv_i64 arg);
+void tcg_gen_ext8u_i64(TCGv_i64 ret, TCGv_i64 arg);
+void tcg_gen_ext16u_i64(TCGv_i64 ret, TCGv_i64 arg);
+void tcg_gen_ext32u_i64(TCGv_i64 ret, TCGv_i64 arg);
+void tcg_gen_bswap16_i64(TCGv_i64 ret, TCGv_i64 arg);
+void tcg_gen_bswap32_i64(TCGv_i64 ret, TCGv_i64 arg);
+void tcg_gen_bswap64_i64(TCGv_i64 ret, TCGv_i64 arg);
 
-#if TCG_TARGET_REG_BITS == 32
+#if TCG_TARGET_REG_BITS == 64
+static inline void tcg_gen_discard_i64(TCGv_i64 arg)
+{
+    tcg_gen_op1_i64(INDEX_op_discard, arg);
+}
 
 static inline void tcg_gen_mov_i64(TCGv_i64 ret, TCGv_i64 arg)
 {
     if (!TCGV_EQUAL_I64(ret, arg)) {
-        tcg_gen_mov_i32(TCGV_LOW(ret), TCGV_LOW(arg));
-        tcg_gen_mov_i32(TCGV_HIGH(ret), TCGV_HIGH(arg));
+        tcg_gen_op2_i64(INDEX_op_mov_i64, ret, arg);
     }
 }
 
 static inline void tcg_gen_movi_i64(TCGv_i64 ret, int64_t arg)
 {
-    tcg_gen_movi_i32(TCGV_LOW(ret), arg);
-    tcg_gen_movi_i32(TCGV_HIGH(ret), arg >> 32);
+    tcg_gen_op2i_i64(INDEX_op_movi_i64, ret, arg);
 }
 
 static inline void tcg_gen_ld8u_i64(TCGv_i64 ret, TCGv_ptr arg2,
                                     tcg_target_long offset)
 {
-    tcg_gen_ld8u_i32(TCGV_LOW(ret), arg2, offset);
-    tcg_gen_movi_i32(TCGV_HIGH(ret), 0);
+    tcg_gen_ldst_op_i64(INDEX_op_ld8u_i64, ret, arg2, offset);
 }
 
 static inline void tcg_gen_ld8s_i64(TCGv_i64 ret, TCGv_ptr arg2,
                                     tcg_target_long offset)
 {
-    tcg_gen_ld8s_i32(TCGV_LOW(ret), arg2, offset);
-    tcg_gen_sari_i32(TCGV_HIGH(ret), TCGV_HIGH(ret), 31);
+    tcg_gen_ldst_op_i64(INDEX_op_ld8s_i64, ret, arg2, offset);
 }
 
 static inline void tcg_gen_ld16u_i64(TCGv_i64 ret, TCGv_ptr arg2,
                                      tcg_target_long offset)
 {
-    tcg_gen_ld16u_i32(TCGV_LOW(ret), arg2, offset);
-    tcg_gen_movi_i32(TCGV_HIGH(ret), 0);
+    tcg_gen_ldst_op_i64(INDEX_op_ld16u_i64, ret, arg2, offset);
 }
 
 static inline void tcg_gen_ld16s_i64(TCGv_i64 ret, TCGv_ptr arg2,
                                      tcg_target_long offset)
 {
-    tcg_gen_ld16s_i32(TCGV_LOW(ret), arg2, offset);
-    tcg_gen_sari_i32(TCGV_HIGH(ret), TCGV_LOW(ret), 31);
+    tcg_gen_ldst_op_i64(INDEX_op_ld16s_i64, ret, arg2, offset);
 }
 
 static inline void tcg_gen_ld32u_i64(TCGv_i64 ret, TCGv_ptr arg2,
                                      tcg_target_long offset)
 {
-    tcg_gen_ld_i32(TCGV_LOW(ret), arg2, offset);
-    tcg_gen_movi_i32(TCGV_HIGH(ret), 0);
+    tcg_gen_ldst_op_i64(INDEX_op_ld32u_i64, ret, arg2, offset);
 }
 
 static inline void tcg_gen_ld32s_i64(TCGv_i64 ret, TCGv_ptr arg2,
                                      tcg_target_long offset)
 {
-    tcg_gen_ld_i32(TCGV_LOW(ret), arg2, offset);
-    tcg_gen_sari_i32(TCGV_HIGH(ret), TCGV_LOW(ret), 31);
+    tcg_gen_ldst_op_i64(INDEX_op_ld32s_i64, ret, arg2, offset);
 }
 
 static inline void tcg_gen_ld_i64(TCGv_i64 ret, TCGv_ptr arg2,
                                   tcg_target_long offset)
 {
-    /* since arg2 and ret have different types, they cannot be the
-       same temporary */
-#ifdef HOST_WORDS_BIGENDIAN
-    tcg_gen_ld_i32(TCGV_HIGH(ret), arg2, offset);
-    tcg_gen_ld_i32(TCGV_LOW(ret), arg2, offset + 4);
-#else
-    tcg_gen_ld_i32(TCGV_LOW(ret), arg2, offset);
-    tcg_gen_ld_i32(TCGV_HIGH(ret), arg2, offset + 4);
-#endif
+    tcg_gen_ldst_op_i64(INDEX_op_ld_i64, ret, arg2, offset);
 }
 
 static inline void tcg_gen_st8_i64(TCGv_i64 arg1, TCGv_ptr arg2,
                                    tcg_target_long offset)
 {
-    tcg_gen_st8_i32(TCGV_LOW(arg1), arg2, offset);
+    tcg_gen_ldst_op_i64(INDEX_op_st8_i64, arg1, arg2, offset);
 }
 
 static inline void tcg_gen_st16_i64(TCGv_i64 arg1, TCGv_ptr arg2,
                                     tcg_target_long offset)
 {
-    tcg_gen_st16_i32(TCGV_LOW(arg1), arg2, offset);
+    tcg_gen_ldst_op_i64(INDEX_op_st16_i64, arg1, arg2, offset);
 }
 
 static inline void tcg_gen_st32_i64(TCGv_i64 arg1, TCGv_ptr arg2,
                                     tcg_target_long offset)
 {
-    tcg_gen_st_i32(TCGV_LOW(arg1), arg2, offset);
+    tcg_gen_ldst_op_i64(INDEX_op_st32_i64, arg1, arg2, offset);
 }
 
 static inline void tcg_gen_st_i64(TCGv_i64 arg1, TCGv_ptr arg2,
                                   tcg_target_long offset)
 {
-#ifdef HOST_WORDS_BIGENDIAN
-    tcg_gen_st_i32(TCGV_HIGH(arg1), arg2, offset);
-    tcg_gen_st_i32(TCGV_LOW(arg1), arg2, offset + 4);
-#else
-    tcg_gen_st_i32(TCGV_LOW(arg1), arg2, offset);
-    tcg_gen_st_i32(TCGV_HIGH(arg1), arg2, offset + 4);
-#endif
+    tcg_gen_ldst_op_i64(INDEX_op_st_i64, arg1, arg2, offset);
 }
 
 static inline void tcg_gen_add_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
 {
-    tcg_gen_op6_i32(INDEX_op_add2_i32, TCGV_LOW(ret), TCGV_HIGH(ret),
-                    TCGV_LOW(arg1), TCGV_HIGH(arg1), TCGV_LOW(arg2),
-                    TCGV_HIGH(arg2));
-    /* Allow the optimizer room to replace add2 with two moves.  */
-    tcg_gen_op0(INDEX_op_nop);
+    tcg_gen_op3_i64(INDEX_op_add_i64, ret, arg1, arg2);
 }
 
 static inline void tcg_gen_sub_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
 {
-    tcg_gen_op6_i32(INDEX_op_sub2_i32, TCGV_LOW(ret), TCGV_HIGH(ret),
-                    TCGV_LOW(arg1), TCGV_HIGH(arg1), TCGV_LOW(arg2),
-                    TCGV_HIGH(arg2));
-    /* Allow the optimizer room to replace sub2 with two moves.  */
-    tcg_gen_op0(INDEX_op_nop);
+    tcg_gen_op3_i64(INDEX_op_sub_i64, ret, arg1, arg2);
 }
 
 static inline void tcg_gen_and_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
 {
-    tcg_gen_and_i32(TCGV_LOW(ret), TCGV_LOW(arg1), TCGV_LOW(arg2));
-    tcg_gen_and_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1), TCGV_HIGH(arg2));
-}
-
-static inline void tcg_gen_andi_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
-{
-    tcg_gen_andi_i32(TCGV_LOW(ret), TCGV_LOW(arg1), arg2);
-    tcg_gen_andi_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1), arg2 >> 32);
+    tcg_gen_op3_i64(INDEX_op_and_i64, ret, arg1, arg2);
 }
 
 static inline void tcg_gen_or_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
 {
-    tcg_gen_or_i32(TCGV_LOW(ret), TCGV_LOW(arg1), TCGV_LOW(arg2));
-    tcg_gen_or_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1), TCGV_HIGH(arg2));
-}
-
-static inline void tcg_gen_ori_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
-{
-    tcg_gen_ori_i32(TCGV_LOW(ret), TCGV_LOW(arg1), arg2);
-    tcg_gen_ori_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1), arg2 >> 32);
+    tcg_gen_op3_i64(INDEX_op_or_i64, ret, arg1, arg2);
 }
 
 static inline void tcg_gen_xor_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
 {
-    tcg_gen_xor_i32(TCGV_LOW(ret), TCGV_LOW(arg1), TCGV_LOW(arg2));
-    tcg_gen_xor_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1), TCGV_HIGH(arg2));
-}
-
-static inline void tcg_gen_xori_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
-{
-    tcg_gen_xori_i32(TCGV_LOW(ret), TCGV_LOW(arg1), arg2);
-    tcg_gen_xori_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1), arg2 >> 32);
+    tcg_gen_op3_i64(INDEX_op_xor_i64, ret, arg1, arg2);
 }
 
-/* XXX: use generic code when basic block handling is OK or CPU
-   specific code (x86) */
 static inline void tcg_gen_shl_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
 {
-    gen_helper_shl_i64(ret, arg1, arg2);
-}
-
-static inline void tcg_gen_shli_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
-{
-    tcg_gen_shifti_i64(ret, arg1, arg2, 0, 0);
+    tcg_gen_op3_i64(INDEX_op_shl_i64, ret, arg1, arg2);
 }
 
 static inline void tcg_gen_shr_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
 {
-    gen_helper_shr_i64(ret, arg1, arg2);
-}
-
-static inline void tcg_gen_shri_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
-{
-    tcg_gen_shifti_i64(ret, arg1, arg2, 1, 0);
+    tcg_gen_op3_i64(INDEX_op_shr_i64, ret, arg1, arg2);
 }
 
 static inline void tcg_gen_sar_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
 {
-    gen_helper_sar_i64(ret, arg1, arg2);
-}
-
-static inline void tcg_gen_sari_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
-{
-    tcg_gen_shifti_i64(ret, arg1, arg2, 1, 1);
-}
-
-static inline void tcg_gen_brcond_i64(TCGCond cond, TCGv_i64 arg1,
-                                      TCGv_i64 arg2, int label_index)
-{
-    if (cond == TCG_COND_ALWAYS) {
-        tcg_gen_br(label_index);
-    } else if (cond != TCG_COND_NEVER) {
-        tcg_gen_op6ii_i32(INDEX_op_brcond2_i32,
-                          TCGV_LOW(arg1), TCGV_HIGH(arg1), TCGV_LOW(arg2),
-                          TCGV_HIGH(arg2), cond, label_index);
-    }
-}
-
-static inline void tcg_gen_setcond_i64(TCGCond cond, TCGv_i64 ret,
-                                       TCGv_i64 arg1, TCGv_i64 arg2)
-{
-    if (cond == TCG_COND_ALWAYS) {
-        tcg_gen_movi_i32(TCGV_LOW(ret), 1);
-    } else if (cond == TCG_COND_NEVER) {
-        tcg_gen_movi_i32(TCGV_LOW(ret), 0);
-    } else {
-        tcg_gen_op6i_i32(INDEX_op_setcond2_i32, TCGV_LOW(ret),
-                         TCGV_LOW(arg1), TCGV_HIGH(arg1),
-                         TCGV_LOW(arg2), TCGV_HIGH(arg2), cond);
-    }
-    tcg_gen_movi_i32(TCGV_HIGH(ret), 0);
+    tcg_gen_op3_i64(INDEX_op_sar_i64, ret, arg1, arg2);
 }
 
 static inline void tcg_gen_mul_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
 {
-    TCGv_i64 t0;
-    TCGv_i32 t1;
-
-    t0 = tcg_temp_new_i64();
-    t1 = tcg_temp_new_i32();
-
-    if (TCG_TARGET_HAS_mulu2_i32) {
-        tcg_gen_op4_i32(INDEX_op_mulu2_i32, TCGV_LOW(t0), TCGV_HIGH(t0),
-                        TCGV_LOW(arg1), TCGV_LOW(arg2));
-        /* Allow the optimizer room to replace mulu2 with two moves.  */
-        tcg_gen_op0(INDEX_op_nop);
-    } else {
-        tcg_debug_assert(TCG_TARGET_HAS_muluh_i32);
-        tcg_gen_op3_i32(INDEX_op_mul_i32, TCGV_LOW(t0),
-                        TCGV_LOW(arg1), TCGV_LOW(arg2));
-        tcg_gen_op3_i32(INDEX_op_muluh_i32, TCGV_HIGH(t0),
-                        TCGV_LOW(arg1), TCGV_LOW(arg2));
-    }
-
-    tcg_gen_mul_i32(t1, TCGV_LOW(arg1), TCGV_HIGH(arg2));
-    tcg_gen_add_i32(TCGV_HIGH(t0), TCGV_HIGH(t0), t1);
-    tcg_gen_mul_i32(t1, TCGV_HIGH(arg1), TCGV_LOW(arg2));
-    tcg_gen_add_i32(TCGV_HIGH(t0), TCGV_HIGH(t0), t1);
-
-    tcg_gen_mov_i64(ret, t0);
-    tcg_temp_free_i64(t0);
-    tcg_temp_free_i32(t1);
+    tcg_gen_op3_i64(INDEX_op_mul_i64, ret, arg1, arg2);
 }
-
-static inline void tcg_gen_div_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
+#else /* TCG_TARGET_REG_BITS == 32 */
+static inline void tcg_gen_st8_i64(TCGv_i64 arg1, TCGv_ptr arg2,
+                                   tcg_target_long offset)
 {
-    gen_helper_div_i64(ret, arg1, arg2);
+    tcg_gen_st8_i32(TCGV_LOW(arg1), arg2, offset);
 }
 
-static inline void tcg_gen_rem_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
+static inline void tcg_gen_st16_i64(TCGv_i64 arg1, TCGv_ptr arg2,
+                                    tcg_target_long offset)
 {
-    gen_helper_rem_i64(ret, arg1, arg2);
+    tcg_gen_st16_i32(TCGV_LOW(arg1), arg2, offset);
 }
 
-static inline void tcg_gen_divu_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
+static inline void tcg_gen_st32_i64(TCGv_i64 arg1, TCGv_ptr arg2,
+                                    tcg_target_long offset)
 {
-    gen_helper_divu_i64(ret, arg1, arg2);
+    tcg_gen_st_i32(TCGV_LOW(arg1), arg2, offset);
 }
 
-static inline void tcg_gen_remu_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
+static inline void tcg_gen_add_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
 {
-    gen_helper_remu_i64(ret, arg1, arg2);
+    tcg_gen_add2_i32(TCGV_LOW(ret), TCGV_HIGH(ret), TCGV_LOW(arg1),
+                     TCGV_HIGH(arg1), TCGV_LOW(arg2), TCGV_HIGH(arg2));
 }
 
-#else
-
-static inline void tcg_gen_mov_i64(TCGv_i64 ret, TCGv_i64 arg)
+static inline void tcg_gen_sub_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
 {
-    if (!TCGV_EQUAL_I64(ret, arg))
-        tcg_gen_op2_i64(INDEX_op_mov_i64, ret, arg);
-}
+    tcg_gen_sub2_i32(TCGV_LOW(ret), TCGV_HIGH(ret), TCGV_LOW(arg1),
+                     TCGV_HIGH(arg1), TCGV_LOW(arg2), TCGV_HIGH(arg2));
+}
+
+void tcg_gen_discard_i64(TCGv_i64 arg);
+void tcg_gen_mov_i64(TCGv_i64 ret, TCGv_i64 arg);
+void tcg_gen_movi_i64(TCGv_i64 ret, int64_t arg);
+void tcg_gen_ld8u_i64(TCGv_i64 ret, TCGv_ptr arg2, tcg_target_long offset);
+void tcg_gen_ld8s_i64(TCGv_i64 ret, TCGv_ptr arg2, tcg_target_long offset);
+void tcg_gen_ld16u_i64(TCGv_i64 ret, TCGv_ptr arg2, tcg_target_long offset);
+void tcg_gen_ld16s_i64(TCGv_i64 ret, TCGv_ptr arg2, tcg_target_long offset);
+void tcg_gen_ld32u_i64(TCGv_i64 ret, TCGv_ptr arg2, tcg_target_long offset);
+void tcg_gen_ld32s_i64(TCGv_i64 ret, TCGv_ptr arg2, tcg_target_long offset);
+void tcg_gen_ld_i64(TCGv_i64 ret, TCGv_ptr arg2, tcg_target_long offset);
+void tcg_gen_st_i64(TCGv_i64 arg1, TCGv_ptr arg2, tcg_target_long offset);
+void tcg_gen_and_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2);
+void tcg_gen_or_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2);
+void tcg_gen_xor_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2);
+void tcg_gen_shl_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2);
+void tcg_gen_shr_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2);
+void tcg_gen_sar_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2);
+void tcg_gen_mul_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2);
+#endif /* TCG_TARGET_REG_BITS */
 
-static inline void tcg_gen_movi_i64(TCGv_i64 ret, int64_t arg)
+static inline void tcg_gen_neg_i64(TCGv_i64 ret, TCGv_i64 arg)
 {
-    tcg_gen_op2i_i64(INDEX_op_movi_i64, ret, arg);
+    if (TCG_TARGET_HAS_neg_i64) {
+        tcg_gen_op2_i64(INDEX_op_neg_i64, ret, arg);
+    } else {
+        tcg_gen_subfi_i64(ret, 0, arg);
+    }
 }
 
-static inline void tcg_gen_ld8u_i64(TCGv_i64 ret, TCGv_ptr arg2,
-                                    tcg_target_long offset)
-{
-    tcg_gen_ldst_op_i64(INDEX_op_ld8u_i64, ret, arg2, offset);
-}
+/* Size changing operations.  */
 
-static inline void tcg_gen_ld8s_i64(TCGv_i64 ret, TCGv_ptr arg2,
-                                    tcg_target_long offset)
-{
-    tcg_gen_ldst_op_i64(INDEX_op_ld8s_i64, ret, arg2, offset);
-}
+void tcg_gen_extu_i32_i64(TCGv_i64 ret, TCGv_i32 arg);
+void tcg_gen_ext_i32_i64(TCGv_i64 ret, TCGv_i32 arg);
+void tcg_gen_concat_i32_i64(TCGv_i64 dest, TCGv_i32 low, TCGv_i32 high);
+void tcg_gen_trunc_shr_i64_i32(TCGv_i32 ret, TCGv_i64 arg, unsigned int c);
+void tcg_gen_extr_i64_i32(TCGv_i32 lo, TCGv_i32 hi, TCGv_i64 arg);
+void tcg_gen_extr32_i64(TCGv_i64 lo, TCGv_i64 hi, TCGv_i64 arg);
 
-static inline void tcg_gen_ld16u_i64(TCGv_i64 ret, TCGv_ptr arg2,
-                                     tcg_target_long offset)
+static inline void tcg_gen_concat32_i64(TCGv_i64 ret, TCGv_i64 lo, TCGv_i64 hi)
 {
-    tcg_gen_ldst_op_i64(INDEX_op_ld16u_i64, ret, arg2, offset);
+    tcg_gen_deposit_i64(ret, lo, hi, 32, 32);
 }
 
-static inline void tcg_gen_ld16s_i64(TCGv_i64 ret, TCGv_ptr arg2,
-                                     tcg_target_long offset)
+static inline void tcg_gen_trunc_i64_i32(TCGv_i32 ret, TCGv_i64 arg)
 {
-    tcg_gen_ldst_op_i64(INDEX_op_ld16s_i64, ret, arg2, offset);
+    tcg_gen_trunc_shr_i64_i32(ret, arg, 0);
 }
 
-static inline void tcg_gen_ld32u_i64(TCGv_i64 ret, TCGv_ptr arg2,
-                                     tcg_target_long offset)
-{
-    tcg_gen_ldst_op_i64(INDEX_op_ld32u_i64, ret, arg2, offset);
-}
+/* QEMU specific operations.  */
 
-static inline void tcg_gen_ld32s_i64(TCGv_i64 ret, TCGv_ptr arg2,
-                                     tcg_target_long offset)
-{
-    tcg_gen_ldst_op_i64(INDEX_op_ld32s_i64, ret, arg2, offset);
-}
+#ifndef TARGET_LONG_BITS
+#error must include QEMU headers
+#endif
 
-static inline void tcg_gen_ld_i64(TCGv_i64 ret, TCGv_ptr arg2, tcg_target_long offset)
+/* debug info: write the PC of the corresponding QEMU CPU instruction */
+static inline void tcg_gen_debug_insn_start(uint64_t pc)
 {
-    tcg_gen_ldst_op_i64(INDEX_op_ld_i64, ret, arg2, offset);
+    /* XXX: must really use a 32 bit size for TCGArg in all cases */
+#if TARGET_LONG_BITS > TCG_TARGET_REG_BITS
+    tcg_gen_op2ii(INDEX_op_debug_insn_start,
+                  (uint32_t)(pc), (uint32_t)(pc >> 32));
+#else
+    tcg_gen_op1i(INDEX_op_debug_insn_start, pc);
+#endif
 }
 
-static inline void tcg_gen_st8_i64(TCGv_i64 arg1, TCGv_ptr arg2,
-                                   tcg_target_long offset)
+static inline void tcg_gen_exit_tb(uintptr_t val)
 {
-    tcg_gen_ldst_op_i64(INDEX_op_st8_i64, arg1, arg2, offset);
+    tcg_gen_op1i(INDEX_op_exit_tb, val);
 }
 
-static inline void tcg_gen_st16_i64(TCGv_i64 arg1, TCGv_ptr arg2,
-                                    tcg_target_long offset)
-{
-    tcg_gen_ldst_op_i64(INDEX_op_st16_i64, arg1, arg2, offset);
-}
-
-static inline void tcg_gen_st32_i64(TCGv_i64 arg1, TCGv_ptr arg2,
-                                    tcg_target_long offset)
-{
-    tcg_gen_ldst_op_i64(INDEX_op_st32_i64, arg1, arg2, offset);
-}
-
-static inline void tcg_gen_st_i64(TCGv_i64 arg1, TCGv_ptr arg2, tcg_target_long offset)
-{
-    tcg_gen_ldst_op_i64(INDEX_op_st_i64, arg1, arg2, offset);
-}
-
-static inline void tcg_gen_add_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
-{
-    tcg_gen_op3_i64(INDEX_op_add_i64, ret, arg1, arg2);
-}
-
-static inline void tcg_gen_sub_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
-{
-    tcg_gen_op3_i64(INDEX_op_sub_i64, ret, arg1, arg2);
-}
-
-static inline void tcg_gen_and_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
-{
-    if (TCGV_EQUAL_I64(arg1, arg2)) {
-        tcg_gen_mov_i64(ret, arg1);
-    } else {
-        tcg_gen_op3_i64(INDEX_op_and_i64, ret, arg1, arg2);
-    }
-}
-
-static inline void tcg_gen_andi_i64(TCGv_i64 ret, TCGv_i64 arg1, uint64_t arg2)
-{
-    TCGv_i64 t0;
-    /* Some cases can be optimized here.  */
-    switch (arg2) {
-    case 0:
-        tcg_gen_movi_i64(ret, 0);
-        return;
-    case 0xffffffffffffffffull:
-        tcg_gen_mov_i64(ret, arg1);
-        return;
-    case 0xffull:
-        /* Don't recurse with tcg_gen_ext8u_i32.  */
-        if (TCG_TARGET_HAS_ext8u_i64) {
-            tcg_gen_op2_i64(INDEX_op_ext8u_i64, ret, arg1);
-            return;
-        }
-        break;
-    case 0xffffu:
-        if (TCG_TARGET_HAS_ext16u_i64) {
-            tcg_gen_op2_i64(INDEX_op_ext16u_i64, ret, arg1);
-            return;
-        }
-        break;
-    case 0xffffffffull:
-        if (TCG_TARGET_HAS_ext32u_i64) {
-            tcg_gen_op2_i64(INDEX_op_ext32u_i64, ret, arg1);
-            return;
-        }
-        break;
-    }
-    t0 = tcg_const_i64(arg2);
-    tcg_gen_and_i64(ret, arg1, t0);
-    tcg_temp_free_i64(t0);
-}
-
-static inline void tcg_gen_or_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
-{
-    if (TCGV_EQUAL_I64(arg1, arg2)) {
-        tcg_gen_mov_i64(ret, arg1);
-    } else {
-        tcg_gen_op3_i64(INDEX_op_or_i64, ret, arg1, arg2);
-    }
-}
-
-static inline void tcg_gen_ori_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
-{
-    /* Some cases can be optimized here.  */
-    if (arg2 == -1) {
-        tcg_gen_movi_i64(ret, -1);
-    } else if (arg2 == 0) {
-        tcg_gen_mov_i64(ret, arg1);
-    } else {
-        TCGv_i64 t0 = tcg_const_i64(arg2);
-        tcg_gen_or_i64(ret, arg1, t0);
-        tcg_temp_free_i64(t0);
-    }
-}
-
-static inline void tcg_gen_xor_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
-{
-    if (TCGV_EQUAL_I64(arg1, arg2)) {
-        tcg_gen_movi_i64(ret, 0);
-    } else {
-        tcg_gen_op3_i64(INDEX_op_xor_i64, ret, arg1, arg2);
-    }
-}
-
-static inline void tcg_gen_xori_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
-{
-    /* Some cases can be optimized here.  */
-    if (arg2 == 0) {
-        tcg_gen_mov_i64(ret, arg1);
-    } else if (arg2 == -1 && TCG_TARGET_HAS_not_i64) {
-        /* Don't recurse with tcg_gen_not_i64.  */
-        tcg_gen_op2_i64(INDEX_op_not_i64, ret, arg1);
-    } else {
-        TCGv_i64 t0 = tcg_const_i64(arg2);
-        tcg_gen_xor_i64(ret, arg1, t0);
-        tcg_temp_free_i64(t0);
-    }
-}
-
-static inline void tcg_gen_shl_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
-{
-    tcg_gen_op3_i64(INDEX_op_shl_i64, ret, arg1, arg2);
-}
-
-static inline void tcg_gen_shli_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
-{
-    if (arg2 == 0) {
-        tcg_gen_mov_i64(ret, arg1);
-    } else {
-        TCGv_i64 t0 = tcg_const_i64(arg2);
-        tcg_gen_shl_i64(ret, arg1, t0);
-        tcg_temp_free_i64(t0);
-    }
-}
-
-static inline void tcg_gen_shr_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
-{
-    tcg_gen_op3_i64(INDEX_op_shr_i64, ret, arg1, arg2);
-}
-
-static inline void tcg_gen_shri_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
-{
-    if (arg2 == 0) {
-        tcg_gen_mov_i64(ret, arg1);
-    } else {
-        TCGv_i64 t0 = tcg_const_i64(arg2);
-        tcg_gen_shr_i64(ret, arg1, t0);
-        tcg_temp_free_i64(t0);
-    }
-}
-
-static inline void tcg_gen_sar_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
-{
-    tcg_gen_op3_i64(INDEX_op_sar_i64, ret, arg1, arg2);
-}
-
-static inline void tcg_gen_sari_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
-{
-    if (arg2 == 0) {
-        tcg_gen_mov_i64(ret, arg1);
-    } else {
-        TCGv_i64 t0 = tcg_const_i64(arg2);
-        tcg_gen_sar_i64(ret, arg1, t0);
-        tcg_temp_free_i64(t0);
-    }
-}
-
-static inline void tcg_gen_brcond_i64(TCGCond cond, TCGv_i64 arg1,
-                                      TCGv_i64 arg2, int label_index)
-{
-    if (cond == TCG_COND_ALWAYS) {
-        tcg_gen_br(label_index);
-    } else if (cond != TCG_COND_NEVER) {
-        tcg_gen_op4ii_i64(INDEX_op_brcond_i64, arg1, arg2, cond, label_index);
-    }
-}
-
-static inline void tcg_gen_setcond_i64(TCGCond cond, TCGv_i64 ret,
-                                       TCGv_i64 arg1, TCGv_i64 arg2)
-{
-    if (cond == TCG_COND_ALWAYS) {
-        tcg_gen_movi_i64(ret, 1);
-    } else if (cond == TCG_COND_NEVER) {
-        tcg_gen_movi_i64(ret, 0);
-    } else {
-        tcg_gen_op4i_i64(INDEX_op_setcond_i64, ret, arg1, arg2, cond);
-    }
-}
-
-static inline void tcg_gen_mul_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
-{
-    tcg_gen_op3_i64(INDEX_op_mul_i64, ret, arg1, arg2);
-}
-
-static inline void tcg_gen_div_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
-{
-    if (TCG_TARGET_HAS_div_i64) {
-        tcg_gen_op3_i64(INDEX_op_div_i64, ret, arg1, arg2);
-    } else if (TCG_TARGET_HAS_div2_i64) {
-        TCGv_i64 t0 = tcg_temp_new_i64();
-        tcg_gen_sari_i64(t0, arg1, 63);
-        tcg_gen_op5_i64(INDEX_op_div2_i64, ret, t0, arg1, t0, arg2);
-        tcg_temp_free_i64(t0);
-    } else {
-        gen_helper_div_i64(ret, arg1, arg2);
-    }
-}
-
-static inline void tcg_gen_rem_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
-{
-    if (TCG_TARGET_HAS_rem_i64) {
-        tcg_gen_op3_i64(INDEX_op_rem_i64, ret, arg1, arg2);
-    } else if (TCG_TARGET_HAS_div_i64) {
-        TCGv_i64 t0 = tcg_temp_new_i64();
-        tcg_gen_op3_i64(INDEX_op_div_i64, t0, arg1, arg2);
-        tcg_gen_mul_i64(t0, t0, arg2);
-        tcg_gen_sub_i64(ret, arg1, t0);
-        tcg_temp_free_i64(t0);
-    } else if (TCG_TARGET_HAS_div2_i64) {
-        TCGv_i64 t0 = tcg_temp_new_i64();
-        tcg_gen_sari_i64(t0, arg1, 63);
-        tcg_gen_op5_i64(INDEX_op_div2_i64, t0, ret, arg1, t0, arg2);
-        tcg_temp_free_i64(t0);
-    } else {
-        gen_helper_rem_i64(ret, arg1, arg2);
-    }
-}
-
-static inline void tcg_gen_divu_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
-{
-    if (TCG_TARGET_HAS_div_i64) {
-        tcg_gen_op3_i64(INDEX_op_divu_i64, ret, arg1, arg2);
-    } else if (TCG_TARGET_HAS_div2_i64) {
-        TCGv_i64 t0 = tcg_temp_new_i64();
-        tcg_gen_movi_i64(t0, 0);
-        tcg_gen_op5_i64(INDEX_op_divu2_i64, ret, t0, arg1, t0, arg2);
-        tcg_temp_free_i64(t0);
-    } else {
-        gen_helper_divu_i64(ret, arg1, arg2);
-    }
-}
-
-static inline void tcg_gen_remu_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
-{
-    if (TCG_TARGET_HAS_rem_i64) {
-        tcg_gen_op3_i64(INDEX_op_remu_i64, ret, arg1, arg2);
-    } else if (TCG_TARGET_HAS_div_i64) {
-        TCGv_i64 t0 = tcg_temp_new_i64();
-        tcg_gen_op3_i64(INDEX_op_divu_i64, t0, arg1, arg2);
-        tcg_gen_mul_i64(t0, t0, arg2);
-        tcg_gen_sub_i64(ret, arg1, t0);
-        tcg_temp_free_i64(t0);
-    } else if (TCG_TARGET_HAS_div2_i64) {
-        TCGv_i64 t0 = tcg_temp_new_i64();
-        tcg_gen_movi_i64(t0, 0);
-        tcg_gen_op5_i64(INDEX_op_divu2_i64, t0, ret, arg1, t0, arg2);
-        tcg_temp_free_i64(t0);
-    } else {
-        gen_helper_remu_i64(ret, arg1, arg2);
-    }
-}
-#endif /* TCG_TARGET_REG_BITS == 32 */
-
-static inline void tcg_gen_addi_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
-{
-    /* some cases can be optimized here */
-    if (arg2 == 0) {
-        tcg_gen_mov_i64(ret, arg1);
-    } else {
-        TCGv_i64 t0 = tcg_const_i64(arg2);
-        tcg_gen_add_i64(ret, arg1, t0);
-        tcg_temp_free_i64(t0);
-    }
-}
-
-static inline void tcg_gen_subfi_i64(TCGv_i64 ret, int64_t arg1, TCGv_i64 arg2)
-{
-    TCGv_i64 t0 = tcg_const_i64(arg1);
-    tcg_gen_sub_i64(ret, t0, arg2);
-    tcg_temp_free_i64(t0);
-}
-
-static inline void tcg_gen_subi_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
-{
-    /* some cases can be optimized here */
-    if (arg2 == 0) {
-        tcg_gen_mov_i64(ret, arg1);
-    } else {
-        TCGv_i64 t0 = tcg_const_i64(arg2);
-        tcg_gen_sub_i64(ret, arg1, t0);
-        tcg_temp_free_i64(t0);
-    }
-}
-static inline void tcg_gen_brcondi_i64(TCGCond cond, TCGv_i64 arg1,
-                                       int64_t arg2, int label_index)
-{
-    if (cond == TCG_COND_ALWAYS) {
-        tcg_gen_br(label_index);
-    } else if (cond != TCG_COND_NEVER) {
-        TCGv_i64 t0 = tcg_const_i64(arg2);
-        tcg_gen_brcond_i64(cond, arg1, t0, label_index);
-        tcg_temp_free_i64(t0);
-    }
-}
-
-static inline void tcg_gen_setcondi_i64(TCGCond cond, TCGv_i64 ret,
-                                        TCGv_i64 arg1, int64_t arg2)
-{
-    TCGv_i64 t0 = tcg_const_i64(arg2);
-    tcg_gen_setcond_i64(cond, ret, arg1, t0);
-    tcg_temp_free_i64(t0);
-}
-
-static inline void tcg_gen_muli_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
-{
-    TCGv_i64 t0 = tcg_const_i64(arg2);
-    tcg_gen_mul_i64(ret, arg1, t0);
-    tcg_temp_free_i64(t0);
-}
-
-
-/***************************************/
-/* optional operations */
-
-static inline void tcg_gen_ext8s_i32(TCGv_i32 ret, TCGv_i32 arg)
-{
-    if (TCG_TARGET_HAS_ext8s_i32) {
-        tcg_gen_op2_i32(INDEX_op_ext8s_i32, ret, arg);
-    } else {
-        tcg_gen_shli_i32(ret, arg, 24);
-        tcg_gen_sari_i32(ret, ret, 24);
-    }
-}
-
-static inline void tcg_gen_ext16s_i32(TCGv_i32 ret, TCGv_i32 arg)
-{
-    if (TCG_TARGET_HAS_ext16s_i32) {
-        tcg_gen_op2_i32(INDEX_op_ext16s_i32, ret, arg);
-    } else {
-        tcg_gen_shli_i32(ret, arg, 16);
-        tcg_gen_sari_i32(ret, ret, 16);
-    }
-}
-
-static inline void tcg_gen_ext8u_i32(TCGv_i32 ret, TCGv_i32 arg)
-{
-    if (TCG_TARGET_HAS_ext8u_i32) {
-        tcg_gen_op2_i32(INDEX_op_ext8u_i32, ret, arg);
-    } else {
-        tcg_gen_andi_i32(ret, arg, 0xffu);
-    }
-}
-
-static inline void tcg_gen_ext16u_i32(TCGv_i32 ret, TCGv_i32 arg)
-{
-    if (TCG_TARGET_HAS_ext16u_i32) {
-        tcg_gen_op2_i32(INDEX_op_ext16u_i32, ret, arg);
-    } else {
-        tcg_gen_andi_i32(ret, arg, 0xffffu);
-    }
-}
-
-/* Note: we assume the two high bytes are set to zero */
-static inline void tcg_gen_bswap16_i32(TCGv_i32 ret, TCGv_i32 arg)
-{
-    if (TCG_TARGET_HAS_bswap16_i32) {
-        tcg_gen_op2_i32(INDEX_op_bswap16_i32, ret, arg);
-    } else {
-        TCGv_i32 t0 = tcg_temp_new_i32();
-    
-        tcg_gen_ext8u_i32(t0, arg);
-        tcg_gen_shli_i32(t0, t0, 8);
-        tcg_gen_shri_i32(ret, arg, 8);
-        tcg_gen_or_i32(ret, ret, t0);
-        tcg_temp_free_i32(t0);
-    }
-}
-
-static inline void tcg_gen_bswap32_i32(TCGv_i32 ret, TCGv_i32 arg)
-{
-    if (TCG_TARGET_HAS_bswap32_i32) {
-        tcg_gen_op2_i32(INDEX_op_bswap32_i32, ret, arg);
-    } else {
-        TCGv_i32 t0, t1;
-        t0 = tcg_temp_new_i32();
-        t1 = tcg_temp_new_i32();
-    
-        tcg_gen_shli_i32(t0, arg, 24);
-    
-        tcg_gen_andi_i32(t1, arg, 0x0000ff00);
-        tcg_gen_shli_i32(t1, t1, 8);
-        tcg_gen_or_i32(t0, t0, t1);
-    
-        tcg_gen_shri_i32(t1, arg, 8);
-        tcg_gen_andi_i32(t1, t1, 0x0000ff00);
-        tcg_gen_or_i32(t0, t0, t1);
-    
-        tcg_gen_shri_i32(t1, arg, 24);
-        tcg_gen_or_i32(ret, t0, t1);
-        tcg_temp_free_i32(t0);
-        tcg_temp_free_i32(t1);
-    }
-}
-
-#if TCG_TARGET_REG_BITS == 32
-static inline void tcg_gen_ext8s_i64(TCGv_i64 ret, TCGv_i64 arg)
-{
-    tcg_gen_ext8s_i32(TCGV_LOW(ret), TCGV_LOW(arg));
-    tcg_gen_sari_i32(TCGV_HIGH(ret), TCGV_LOW(ret), 31);
-}
-
-static inline void tcg_gen_ext16s_i64(TCGv_i64 ret, TCGv_i64 arg)
-{
-    tcg_gen_ext16s_i32(TCGV_LOW(ret), TCGV_LOW(arg));
-    tcg_gen_sari_i32(TCGV_HIGH(ret), TCGV_LOW(ret), 31);
-}
-
-static inline void tcg_gen_ext32s_i64(TCGv_i64 ret, TCGv_i64 arg)
-{
-    tcg_gen_mov_i32(TCGV_LOW(ret), TCGV_LOW(arg));
-    tcg_gen_sari_i32(TCGV_HIGH(ret), TCGV_LOW(ret), 31);
-}
-
-static inline void tcg_gen_ext8u_i64(TCGv_i64 ret, TCGv_i64 arg)
-{
-    tcg_gen_ext8u_i32(TCGV_LOW(ret), TCGV_LOW(arg));
-    tcg_gen_movi_i32(TCGV_HIGH(ret), 0);
-}
-
-static inline void tcg_gen_ext16u_i64(TCGv_i64 ret, TCGv_i64 arg)
-{
-    tcg_gen_ext16u_i32(TCGV_LOW(ret), TCGV_LOW(arg));
-    tcg_gen_movi_i32(TCGV_HIGH(ret), 0);
-}
-
-static inline void tcg_gen_ext32u_i64(TCGv_i64 ret, TCGv_i64 arg)
-{
-    tcg_gen_mov_i32(TCGV_LOW(ret), TCGV_LOW(arg));
-    tcg_gen_movi_i32(TCGV_HIGH(ret), 0);
-}
-
-static inline void tcg_gen_trunc_shr_i64_i32(TCGv_i32 ret, TCGv_i64 arg,
-                                             unsigned int count)
-{
-    tcg_debug_assert(count < 64);
-    if (count >= 32) {
-        tcg_gen_shri_i32(ret, TCGV_HIGH(arg), count - 32);
-    } else if (count == 0) {
-        tcg_gen_mov_i32(ret, TCGV_LOW(arg));
-    } else {
-        TCGv_i64 t = tcg_temp_new_i64();
-        tcg_gen_shri_i64(t, arg, count);
-        tcg_gen_mov_i32(ret, TCGV_LOW(t));
-        tcg_temp_free_i64(t);
-    }
-}
-
-static inline void tcg_gen_extu_i32_i64(TCGv_i64 ret, TCGv_i32 arg)
-{
-    tcg_gen_mov_i32(TCGV_LOW(ret), arg);
-    tcg_gen_movi_i32(TCGV_HIGH(ret), 0);
-}
-
-static inline void tcg_gen_ext_i32_i64(TCGv_i64 ret, TCGv_i32 arg)
-{
-    tcg_gen_mov_i32(TCGV_LOW(ret), arg);
-    tcg_gen_sari_i32(TCGV_HIGH(ret), TCGV_LOW(ret), 31);
-}
-
-/* Note: we assume the six high bytes are set to zero */
-static inline void tcg_gen_bswap16_i64(TCGv_i64 ret, TCGv_i64 arg)
-{
-    tcg_gen_mov_i32(TCGV_HIGH(ret), TCGV_HIGH(arg));
-    tcg_gen_bswap16_i32(TCGV_LOW(ret), TCGV_LOW(arg));
-}
-
-/* Note: we assume the four high bytes are set to zero */
-static inline void tcg_gen_bswap32_i64(TCGv_i64 ret, TCGv_i64 arg)
-{
-    tcg_gen_mov_i32(TCGV_HIGH(ret), TCGV_HIGH(arg));
-    tcg_gen_bswap32_i32(TCGV_LOW(ret), TCGV_LOW(arg));
-}
-
-static inline void tcg_gen_bswap64_i64(TCGv_i64 ret, TCGv_i64 arg)
-{
-    TCGv_i32 t0, t1;
-    t0 = tcg_temp_new_i32();
-    t1 = tcg_temp_new_i32();
-
-    tcg_gen_bswap32_i32(t0, TCGV_LOW(arg));
-    tcg_gen_bswap32_i32(t1, TCGV_HIGH(arg));
-    tcg_gen_mov_i32(TCGV_LOW(ret), t1);
-    tcg_gen_mov_i32(TCGV_HIGH(ret), t0);
-    tcg_temp_free_i32(t0);
-    tcg_temp_free_i32(t1);
-}
-#else
-
-static inline void tcg_gen_ext8s_i64(TCGv_i64 ret, TCGv_i64 arg)
-{
-    if (TCG_TARGET_HAS_ext8s_i64) {
-        tcg_gen_op2_i64(INDEX_op_ext8s_i64, ret, arg);
-    } else {
-        tcg_gen_shli_i64(ret, arg, 56);
-        tcg_gen_sari_i64(ret, ret, 56);
-    }
-}
-
-static inline void tcg_gen_ext16s_i64(TCGv_i64 ret, TCGv_i64 arg)
-{
-    if (TCG_TARGET_HAS_ext16s_i64) {
-        tcg_gen_op2_i64(INDEX_op_ext16s_i64, ret, arg);
-    } else {
-        tcg_gen_shli_i64(ret, arg, 48);
-        tcg_gen_sari_i64(ret, ret, 48);
-    }
-}
-
-static inline void tcg_gen_ext32s_i64(TCGv_i64 ret, TCGv_i64 arg)
-{
-    if (TCG_TARGET_HAS_ext32s_i64) {
-        tcg_gen_op2_i64(INDEX_op_ext32s_i64, ret, arg);
-    } else {
-        tcg_gen_shli_i64(ret, arg, 32);
-        tcg_gen_sari_i64(ret, ret, 32);
-    }
-}
-
-static inline void tcg_gen_ext8u_i64(TCGv_i64 ret, TCGv_i64 arg)
-{
-    if (TCG_TARGET_HAS_ext8u_i64) {
-        tcg_gen_op2_i64(INDEX_op_ext8u_i64, ret, arg);
-    } else {
-        tcg_gen_andi_i64(ret, arg, 0xffu);
-    }
-}
-
-static inline void tcg_gen_ext16u_i64(TCGv_i64 ret, TCGv_i64 arg)
-{
-    if (TCG_TARGET_HAS_ext16u_i64) {
-        tcg_gen_op2_i64(INDEX_op_ext16u_i64, ret, arg);
-    } else {
-        tcg_gen_andi_i64(ret, arg, 0xffffu);
-    }
-}
-
-static inline void tcg_gen_ext32u_i64(TCGv_i64 ret, TCGv_i64 arg)
-{
-    if (TCG_TARGET_HAS_ext32u_i64) {
-        tcg_gen_op2_i64(INDEX_op_ext32u_i64, ret, arg);
-    } else {
-        tcg_gen_andi_i64(ret, arg, 0xffffffffu);
-    }
-}
-
-static inline void tcg_gen_trunc_shr_i64_i32(TCGv_i32 ret, TCGv_i64 arg,
-                                             unsigned int count)
-{
-    tcg_debug_assert(count < 64);
-    if (TCG_TARGET_HAS_trunc_shr_i32) {
-        tcg_gen_op3i_i32(INDEX_op_trunc_shr_i32, ret,
-                         MAKE_TCGV_I32(GET_TCGV_I64(arg)), count);
-    } else if (count == 0) {
-        tcg_gen_mov_i32(ret, MAKE_TCGV_I32(GET_TCGV_I64(arg)));
-    } else {
-        TCGv_i64 t = tcg_temp_new_i64();
-        tcg_gen_shri_i64(t, arg, count);
-        tcg_gen_mov_i32(ret, MAKE_TCGV_I32(GET_TCGV_I64(t)));
-        tcg_temp_free_i64(t);
-    }
-}
-
-/* Note: we assume the target supports move between 32 and 64 bit
-   registers */
-static inline void tcg_gen_extu_i32_i64(TCGv_i64 ret, TCGv_i32 arg)
-{
-    tcg_gen_ext32u_i64(ret, MAKE_TCGV_I64(GET_TCGV_I32(arg)));
-}
-
-/* Note: we assume the target supports move between 32 and 64 bit
-   registers */
-static inline void tcg_gen_ext_i32_i64(TCGv_i64 ret, TCGv_i32 arg)
-{
-    tcg_gen_ext32s_i64(ret, MAKE_TCGV_I64(GET_TCGV_I32(arg)));
-}
-
-/* Note: we assume the six high bytes are set to zero */
-static inline void tcg_gen_bswap16_i64(TCGv_i64 ret, TCGv_i64 arg)
-{
-    if (TCG_TARGET_HAS_bswap16_i64) {
-        tcg_gen_op2_i64(INDEX_op_bswap16_i64, ret, arg);
-    } else {
-        TCGv_i64 t0 = tcg_temp_new_i64();
-
-        tcg_gen_ext8u_i64(t0, arg);
-        tcg_gen_shli_i64(t0, t0, 8);
-        tcg_gen_shri_i64(ret, arg, 8);
-        tcg_gen_or_i64(ret, ret, t0);
-        tcg_temp_free_i64(t0);
-    }
-}
-
-/* Note: we assume the four high bytes are set to zero */
-static inline void tcg_gen_bswap32_i64(TCGv_i64 ret, TCGv_i64 arg)
-{
-    if (TCG_TARGET_HAS_bswap32_i64) {
-        tcg_gen_op2_i64(INDEX_op_bswap32_i64, ret, arg);
-    } else {
-        TCGv_i64 t0, t1;
-        t0 = tcg_temp_new_i64();
-        t1 = tcg_temp_new_i64();
-
-        tcg_gen_shli_i64(t0, arg, 24);
-        tcg_gen_ext32u_i64(t0, t0);
-
-        tcg_gen_andi_i64(t1, arg, 0x0000ff00);
-        tcg_gen_shli_i64(t1, t1, 8);
-        tcg_gen_or_i64(t0, t0, t1);
-
-        tcg_gen_shri_i64(t1, arg, 8);
-        tcg_gen_andi_i64(t1, t1, 0x0000ff00);
-        tcg_gen_or_i64(t0, t0, t1);
-
-        tcg_gen_shri_i64(t1, arg, 24);
-        tcg_gen_or_i64(ret, t0, t1);
-        tcg_temp_free_i64(t0);
-        tcg_temp_free_i64(t1);
-    }
-}
-
-static inline void tcg_gen_bswap64_i64(TCGv_i64 ret, TCGv_i64 arg)
-{
-    if (TCG_TARGET_HAS_bswap64_i64) {
-        tcg_gen_op2_i64(INDEX_op_bswap64_i64, ret, arg);
-    } else {
-        TCGv_i64 t0 = tcg_temp_new_i64();
-        TCGv_i64 t1 = tcg_temp_new_i64();
-    
-        tcg_gen_shli_i64(t0, arg, 56);
-    
-        tcg_gen_andi_i64(t1, arg, 0x0000ff00);
-        tcg_gen_shli_i64(t1, t1, 40);
-        tcg_gen_or_i64(t0, t0, t1);
-    
-        tcg_gen_andi_i64(t1, arg, 0x00ff0000);
-        tcg_gen_shli_i64(t1, t1, 24);
-        tcg_gen_or_i64(t0, t0, t1);
-
-        tcg_gen_andi_i64(t1, arg, 0xff000000);
-        tcg_gen_shli_i64(t1, t1, 8);
-        tcg_gen_or_i64(t0, t0, t1);
-
-        tcg_gen_shri_i64(t1, arg, 8);
-        tcg_gen_andi_i64(t1, t1, 0xff000000);
-        tcg_gen_or_i64(t0, t0, t1);
-    
-        tcg_gen_shri_i64(t1, arg, 24);
-        tcg_gen_andi_i64(t1, t1, 0x00ff0000);
-        tcg_gen_or_i64(t0, t0, t1);
-
-        tcg_gen_shri_i64(t1, arg, 40);
-        tcg_gen_andi_i64(t1, t1, 0x0000ff00);
-        tcg_gen_or_i64(t0, t0, t1);
-
-        tcg_gen_shri_i64(t1, arg, 56);
-        tcg_gen_or_i64(ret, t0, t1);
-        tcg_temp_free_i64(t0);
-        tcg_temp_free_i64(t1);
-    }
-}
-
-#endif
-
-static inline void tcg_gen_neg_i32(TCGv_i32 ret, TCGv_i32 arg)
-{
-    if (TCG_TARGET_HAS_neg_i32) {
-        tcg_gen_op2_i32(INDEX_op_neg_i32, ret, arg);
-    } else {
-        TCGv_i32 t0 = tcg_const_i32(0);
-        tcg_gen_sub_i32(ret, t0, arg);
-        tcg_temp_free_i32(t0);
-    }
-}
-
-static inline void tcg_gen_neg_i64(TCGv_i64 ret, TCGv_i64 arg)
-{
-    if (TCG_TARGET_HAS_neg_i64) {
-        tcg_gen_op2_i64(INDEX_op_neg_i64, ret, arg);
-    } else {
-        TCGv_i64 t0 = tcg_const_i64(0);
-        tcg_gen_sub_i64(ret, t0, arg);
-        tcg_temp_free_i64(t0);
-    }
-}
-
-static inline void tcg_gen_not_i32(TCGv_i32 ret, TCGv_i32 arg)
-{
-    if (TCG_TARGET_HAS_not_i32) {
-        tcg_gen_op2_i32(INDEX_op_not_i32, ret, arg);
-    } else {
-        tcg_gen_xori_i32(ret, arg, -1);
-    }
-}
-
-static inline void tcg_gen_not_i64(TCGv_i64 ret, TCGv_i64 arg)
-{
-#if TCG_TARGET_REG_BITS == 64
-    if (TCG_TARGET_HAS_not_i64) {
-        tcg_gen_op2_i64(INDEX_op_not_i64, ret, arg);
-    } else {
-        tcg_gen_xori_i64(ret, arg, -1);
-    }
-#else
-    tcg_gen_not_i32(TCGV_LOW(ret), TCGV_LOW(arg));
-    tcg_gen_not_i32(TCGV_HIGH(ret), TCGV_HIGH(arg));
-#endif
-}
-
-static inline void tcg_gen_discard_i32(TCGv_i32 arg)
-{
-    tcg_gen_op1_i32(INDEX_op_discard, arg);
-}
-
-static inline void tcg_gen_discard_i64(TCGv_i64 arg)
-{
-#if TCG_TARGET_REG_BITS == 32
-    tcg_gen_discard_i32(TCGV_LOW(arg));
-    tcg_gen_discard_i32(TCGV_HIGH(arg));
-#else
-    tcg_gen_op1_i64(INDEX_op_discard, arg);
-#endif
-}
-
-static inline void tcg_gen_andc_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
-{
-    if (TCG_TARGET_HAS_andc_i32) {
-        tcg_gen_op3_i32(INDEX_op_andc_i32, ret, arg1, arg2);
-    } else {
-        TCGv_i32 t0 = tcg_temp_new_i32();
-        tcg_gen_not_i32(t0, arg2);
-        tcg_gen_and_i32(ret, arg1, t0);
-        tcg_temp_free_i32(t0);
-    }
-}
-
-static inline void tcg_gen_andc_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
-{
-#if TCG_TARGET_REG_BITS == 64
-    if (TCG_TARGET_HAS_andc_i64) {
-        tcg_gen_op3_i64(INDEX_op_andc_i64, ret, arg1, arg2);
-    } else {
-        TCGv_i64 t0 = tcg_temp_new_i64();
-        tcg_gen_not_i64(t0, arg2);
-        tcg_gen_and_i64(ret, arg1, t0);
-        tcg_temp_free_i64(t0);
-    }
-#else
-    tcg_gen_andc_i32(TCGV_LOW(ret), TCGV_LOW(arg1), TCGV_LOW(arg2));
-    tcg_gen_andc_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1), TCGV_HIGH(arg2));
-#endif
-}
-
-static inline void tcg_gen_eqv_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
-{
-    if (TCG_TARGET_HAS_eqv_i32) {
-        tcg_gen_op3_i32(INDEX_op_eqv_i32, ret, arg1, arg2);
-    } else {
-        tcg_gen_xor_i32(ret, arg1, arg2);
-        tcg_gen_not_i32(ret, ret);
-    }
-}
-
-static inline void tcg_gen_eqv_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
-{
-#if TCG_TARGET_REG_BITS == 64
-    if (TCG_TARGET_HAS_eqv_i64) {
-        tcg_gen_op3_i64(INDEX_op_eqv_i64, ret, arg1, arg2);
-    } else {
-        tcg_gen_xor_i64(ret, arg1, arg2);
-        tcg_gen_not_i64(ret, ret);
-    }
-#else
-    tcg_gen_eqv_i32(TCGV_LOW(ret), TCGV_LOW(arg1), TCGV_LOW(arg2));
-    tcg_gen_eqv_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1), TCGV_HIGH(arg2));
-#endif
-}
-
-static inline void tcg_gen_nand_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
-{
-    if (TCG_TARGET_HAS_nand_i32) {
-        tcg_gen_op3_i32(INDEX_op_nand_i32, ret, arg1, arg2);
-    } else {
-        tcg_gen_and_i32(ret, arg1, arg2);
-        tcg_gen_not_i32(ret, ret);
-    }
-}
-
-static inline void tcg_gen_nand_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
-{
-#if TCG_TARGET_REG_BITS == 64
-    if (TCG_TARGET_HAS_nand_i64) {
-        tcg_gen_op3_i64(INDEX_op_nand_i64, ret, arg1, arg2);
-    } else {
-        tcg_gen_and_i64(ret, arg1, arg2);
-        tcg_gen_not_i64(ret, ret);
-    }
-#else
-    tcg_gen_nand_i32(TCGV_LOW(ret), TCGV_LOW(arg1), TCGV_LOW(arg2));
-    tcg_gen_nand_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1), TCGV_HIGH(arg2));
-#endif
-}
-
-static inline void tcg_gen_nor_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
-{
-    if (TCG_TARGET_HAS_nor_i32) {
-        tcg_gen_op3_i32(INDEX_op_nor_i32, ret, arg1, arg2);
-    } else {
-        tcg_gen_or_i32(ret, arg1, arg2);
-        tcg_gen_not_i32(ret, ret);
-    }
-}
-
-static inline void tcg_gen_nor_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
-{
-#if TCG_TARGET_REG_BITS == 64
-    if (TCG_TARGET_HAS_nor_i64) {
-        tcg_gen_op3_i64(INDEX_op_nor_i64, ret, arg1, arg2);
-    } else {
-        tcg_gen_or_i64(ret, arg1, arg2);
-        tcg_gen_not_i64(ret, ret);
-    }
-#else
-    tcg_gen_nor_i32(TCGV_LOW(ret), TCGV_LOW(arg1), TCGV_LOW(arg2));
-    tcg_gen_nor_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1), TCGV_HIGH(arg2));
-#endif
-}
-
-static inline void tcg_gen_orc_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
-{
-    if (TCG_TARGET_HAS_orc_i32) {
-        tcg_gen_op3_i32(INDEX_op_orc_i32, ret, arg1, arg2);
-    } else {
-        TCGv_i32 t0 = tcg_temp_new_i32();
-        tcg_gen_not_i32(t0, arg2);
-        tcg_gen_or_i32(ret, arg1, t0);
-        tcg_temp_free_i32(t0);
-    }
-}
-
-static inline void tcg_gen_orc_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
-{
-#if TCG_TARGET_REG_BITS == 64
-    if (TCG_TARGET_HAS_orc_i64) {
-        tcg_gen_op3_i64(INDEX_op_orc_i64, ret, arg1, arg2);
-    } else {
-        TCGv_i64 t0 = tcg_temp_new_i64();
-        tcg_gen_not_i64(t0, arg2);
-        tcg_gen_or_i64(ret, arg1, t0);
-        tcg_temp_free_i64(t0);
-    }
-#else
-    tcg_gen_orc_i32(TCGV_LOW(ret), TCGV_LOW(arg1), TCGV_LOW(arg2));
-    tcg_gen_orc_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1), TCGV_HIGH(arg2));
-#endif
-}
-
-static inline void tcg_gen_rotl_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
-{
-    if (TCG_TARGET_HAS_rot_i32) {
-        tcg_gen_op3_i32(INDEX_op_rotl_i32, ret, arg1, arg2);
-    } else {
-        TCGv_i32 t0, t1;
-
-        t0 = tcg_temp_new_i32();
-        t1 = tcg_temp_new_i32();
-        tcg_gen_shl_i32(t0, arg1, arg2);
-        tcg_gen_subfi_i32(t1, 32, arg2);
-        tcg_gen_shr_i32(t1, arg1, t1);
-        tcg_gen_or_i32(ret, t0, t1);
-        tcg_temp_free_i32(t0);
-        tcg_temp_free_i32(t1);
-    }
-}
-
-static inline void tcg_gen_rotl_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
-{
-    if (TCG_TARGET_HAS_rot_i64) {
-        tcg_gen_op3_i64(INDEX_op_rotl_i64, ret, arg1, arg2);
-    } else {
-        TCGv_i64 t0, t1;
-        t0 = tcg_temp_new_i64();
-        t1 = tcg_temp_new_i64();
-        tcg_gen_shl_i64(t0, arg1, arg2);
-        tcg_gen_subfi_i64(t1, 64, arg2);
-        tcg_gen_shr_i64(t1, arg1, t1);
-        tcg_gen_or_i64(ret, t0, t1);
-        tcg_temp_free_i64(t0);
-        tcg_temp_free_i64(t1);
-    }
-}
-
-static inline void tcg_gen_rotli_i32(TCGv_i32 ret, TCGv_i32 arg1, int32_t arg2)
-{
-    /* some cases can be optimized here */
-    if (arg2 == 0) {
-        tcg_gen_mov_i32(ret, arg1);
-    } else if (TCG_TARGET_HAS_rot_i32) {
-        TCGv_i32 t0 = tcg_const_i32(arg2);
-        tcg_gen_rotl_i32(ret, arg1, t0);
-        tcg_temp_free_i32(t0);
-    } else {
-        TCGv_i32 t0, t1;
-        t0 = tcg_temp_new_i32();
-        t1 = tcg_temp_new_i32();
-        tcg_gen_shli_i32(t0, arg1, arg2);
-        tcg_gen_shri_i32(t1, arg1, 32 - arg2);
-        tcg_gen_or_i32(ret, t0, t1);
-        tcg_temp_free_i32(t0);
-        tcg_temp_free_i32(t1);
-    }
-}
-
-static inline void tcg_gen_rotli_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
-{
-    /* some cases can be optimized here */
-    if (arg2 == 0) {
-        tcg_gen_mov_i64(ret, arg1);
-    } else if (TCG_TARGET_HAS_rot_i64) {
-        TCGv_i64 t0 = tcg_const_i64(arg2);
-        tcg_gen_rotl_i64(ret, arg1, t0);
-        tcg_temp_free_i64(t0);
-    } else {
-        TCGv_i64 t0, t1;
-        t0 = tcg_temp_new_i64();
-        t1 = tcg_temp_new_i64();
-        tcg_gen_shli_i64(t0, arg1, arg2);
-        tcg_gen_shri_i64(t1, arg1, 64 - arg2);
-        tcg_gen_or_i64(ret, t0, t1);
-        tcg_temp_free_i64(t0);
-        tcg_temp_free_i64(t1);
-    }
-}
-
-static inline void tcg_gen_rotr_i32(TCGv_i32 ret, TCGv_i32 arg1, TCGv_i32 arg2)
-{
-    if (TCG_TARGET_HAS_rot_i32) {
-        tcg_gen_op3_i32(INDEX_op_rotr_i32, ret, arg1, arg2);
-    } else {
-        TCGv_i32 t0, t1;
-
-        t0 = tcg_temp_new_i32();
-        t1 = tcg_temp_new_i32();
-        tcg_gen_shr_i32(t0, arg1, arg2);
-        tcg_gen_subfi_i32(t1, 32, arg2);
-        tcg_gen_shl_i32(t1, arg1, t1);
-        tcg_gen_or_i32(ret, t0, t1);
-        tcg_temp_free_i32(t0);
-        tcg_temp_free_i32(t1);
-    }
-}
-
-static inline void tcg_gen_rotr_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
-{
-    if (TCG_TARGET_HAS_rot_i64) {
-        tcg_gen_op3_i64(INDEX_op_rotr_i64, ret, arg1, arg2);
-    } else {
-        TCGv_i64 t0, t1;
-        t0 = tcg_temp_new_i64();
-        t1 = tcg_temp_new_i64();
-        tcg_gen_shr_i64(t0, arg1, arg2);
-        tcg_gen_subfi_i64(t1, 64, arg2);
-        tcg_gen_shl_i64(t1, arg1, t1);
-        tcg_gen_or_i64(ret, t0, t1);
-        tcg_temp_free_i64(t0);
-        tcg_temp_free_i64(t1);
-    }
-}
-
-static inline void tcg_gen_rotri_i32(TCGv_i32 ret, TCGv_i32 arg1, int32_t arg2)
-{
-    /* some cases can be optimized here */
-    if (arg2 == 0) {
-        tcg_gen_mov_i32(ret, arg1);
-    } else {
-        tcg_gen_rotli_i32(ret, arg1, 32 - arg2);
-    }
-}
-
-static inline void tcg_gen_rotri_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
-{
-    /* some cases can be optimized here */
-    if (arg2 == 0) {
-        tcg_gen_mov_i64(ret, arg1);
-    } else {
-        tcg_gen_rotli_i64(ret, arg1, 64 - arg2);
-    }
-}
-
-static inline void tcg_gen_deposit_i32(TCGv_i32 ret, TCGv_i32 arg1,
-                                       TCGv_i32 arg2, unsigned int ofs,
-                                       unsigned int len)
-{
-    uint32_t mask;
-    TCGv_i32 t1;
-
-    tcg_debug_assert(ofs < 32);
-    tcg_debug_assert(len <= 32);
-    tcg_debug_assert(ofs + len <= 32);
-
-    if (ofs == 0 && len == 32) {
-        tcg_gen_mov_i32(ret, arg2);
-        return;
-    }
-    if (TCG_TARGET_HAS_deposit_i32 && TCG_TARGET_deposit_i32_valid(ofs, len)) {
-        tcg_gen_op5ii_i32(INDEX_op_deposit_i32, ret, arg1, arg2, ofs, len);
-        return;
-    }
-
-    mask = (1u << len) - 1;
-    t1 = tcg_temp_new_i32();
-
-    if (ofs + len < 32) {
-        tcg_gen_andi_i32(t1, arg2, mask);
-        tcg_gen_shli_i32(t1, t1, ofs);
-    } else {
-        tcg_gen_shli_i32(t1, arg2, ofs);
-    }
-    tcg_gen_andi_i32(ret, arg1, ~(mask << ofs));
-    tcg_gen_or_i32(ret, ret, t1);
-
-    tcg_temp_free_i32(t1);
-}
-
-static inline void tcg_gen_deposit_i64(TCGv_i64 ret, TCGv_i64 arg1,
-                                       TCGv_i64 arg2, unsigned int ofs,
-                                       unsigned int len)
-{
-    uint64_t mask;
-    TCGv_i64 t1;
-
-    tcg_debug_assert(ofs < 64);
-    tcg_debug_assert(len <= 64);
-    tcg_debug_assert(ofs + len <= 64);
-
-    if (ofs == 0 && len == 64) {
-        tcg_gen_mov_i64(ret, arg2);
-        return;
-    }
-    if (TCG_TARGET_HAS_deposit_i64 && TCG_TARGET_deposit_i64_valid(ofs, len)) {
-        tcg_gen_op5ii_i64(INDEX_op_deposit_i64, ret, arg1, arg2, ofs, len);
-        return;
-    }
-
-#if TCG_TARGET_REG_BITS == 32
-    if (ofs >= 32) {
-        tcg_gen_deposit_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1),
-                            TCGV_LOW(arg2), ofs - 32, len);
-        tcg_gen_mov_i32(TCGV_LOW(ret), TCGV_LOW(arg1));
-        return;
-    }
-    if (ofs + len <= 32) {
-        tcg_gen_deposit_i32(TCGV_LOW(ret), TCGV_LOW(arg1),
-                            TCGV_LOW(arg2), ofs, len);
-        tcg_gen_mov_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1));
-        return;
-    }
-#endif
-
-    mask = (1ull << len) - 1;
-    t1 = tcg_temp_new_i64();
-
-    if (ofs + len < 64) {
-        tcg_gen_andi_i64(t1, arg2, mask);
-        tcg_gen_shli_i64(t1, t1, ofs);
-    } else {
-        tcg_gen_shli_i64(t1, arg2, ofs);
-    }
-    tcg_gen_andi_i64(ret, arg1, ~(mask << ofs));
-    tcg_gen_or_i64(ret, ret, t1);
-
-    tcg_temp_free_i64(t1);
-}
-
-static inline void tcg_gen_concat_i32_i64(TCGv_i64 dest, TCGv_i32 low,
-                                          TCGv_i32 high)
-{
-#if TCG_TARGET_REG_BITS == 32
-    tcg_gen_mov_i32(TCGV_LOW(dest), low);
-    tcg_gen_mov_i32(TCGV_HIGH(dest), high);
-#else
-    TCGv_i64 tmp = tcg_temp_new_i64();
-    /* These extensions are only needed for type correctness.
-       We may be able to do better given target specific information.  */
-    tcg_gen_extu_i32_i64(tmp, high);
-    tcg_gen_extu_i32_i64(dest, low);
-    /* If deposit is available, use it.  Otherwise use the extra
-       knowledge that we have of the zero-extensions above.  */
-    if (TCG_TARGET_HAS_deposit_i64 && TCG_TARGET_deposit_i64_valid(32, 32)) {
-        tcg_gen_deposit_i64(dest, dest, tmp, 32, 32);
-    } else {
-        tcg_gen_shli_i64(tmp, tmp, 32);
-        tcg_gen_or_i64(dest, dest, tmp);
-    }
-    tcg_temp_free_i64(tmp);
-#endif
-}
-
-static inline void tcg_gen_concat32_i64(TCGv_i64 dest, TCGv_i64 low,
-                                        TCGv_i64 high)
-{
-    tcg_gen_deposit_i64(dest, low, high, 32, 32);
-}
-
-static inline void tcg_gen_trunc_i64_i32(TCGv_i32 ret, TCGv_i64 arg)
-{
-    tcg_gen_trunc_shr_i64_i32(ret, arg, 0);
-}
-
-static inline void tcg_gen_extr_i64_i32(TCGv_i32 lo, TCGv_i32 hi, TCGv_i64 arg)
-{
-    tcg_gen_trunc_shr_i64_i32(lo, arg, 0);
-    tcg_gen_trunc_shr_i64_i32(hi, arg, 32);
-}
-
-static inline void tcg_gen_extr32_i64(TCGv_i64 lo, TCGv_i64 hi, TCGv_i64 arg)
-{
-    tcg_gen_ext32u_i64(lo, arg);
-    tcg_gen_shri_i64(hi, arg, 32);
-}
-
-static inline void tcg_gen_movcond_i32(TCGCond cond, TCGv_i32 ret,
-                                       TCGv_i32 c1, TCGv_i32 c2,
-                                       TCGv_i32 v1, TCGv_i32 v2)
-{
-    if (TCG_TARGET_HAS_movcond_i32) {
-        tcg_gen_op6i_i32(INDEX_op_movcond_i32, ret, c1, c2, v1, v2, cond);
-    } else {
-        TCGv_i32 t0 = tcg_temp_new_i32();
-        TCGv_i32 t1 = tcg_temp_new_i32();
-        tcg_gen_setcond_i32(cond, t0, c1, c2);
-        tcg_gen_neg_i32(t0, t0);
-        tcg_gen_and_i32(t1, v1, t0);
-        tcg_gen_andc_i32(ret, v2, t0);
-        tcg_gen_or_i32(ret, ret, t1);
-        tcg_temp_free_i32(t0);
-        tcg_temp_free_i32(t1);
-    }
-}
-
-static inline void tcg_gen_movcond_i64(TCGCond cond, TCGv_i64 ret,
-                                       TCGv_i64 c1, TCGv_i64 c2,
-                                       TCGv_i64 v1, TCGv_i64 v2)
-{
-#if TCG_TARGET_REG_BITS == 32
-    TCGv_i32 t0 = tcg_temp_new_i32();
-    TCGv_i32 t1 = tcg_temp_new_i32();
-    tcg_gen_op6i_i32(INDEX_op_setcond2_i32, t0,
-                     TCGV_LOW(c1), TCGV_HIGH(c1),
-                     TCGV_LOW(c2), TCGV_HIGH(c2), cond);
-
-    if (TCG_TARGET_HAS_movcond_i32) {
-        tcg_gen_movi_i32(t1, 0);
-        tcg_gen_movcond_i32(TCG_COND_NE, TCGV_LOW(ret), t0, t1,
-                            TCGV_LOW(v1), TCGV_LOW(v2));
-        tcg_gen_movcond_i32(TCG_COND_NE, TCGV_HIGH(ret), t0, t1,
-                            TCGV_HIGH(v1), TCGV_HIGH(v2));
-    } else {
-        tcg_gen_neg_i32(t0, t0);
-
-        tcg_gen_and_i32(t1, TCGV_LOW(v1), t0);
-        tcg_gen_andc_i32(TCGV_LOW(ret), TCGV_LOW(v2), t0);
-        tcg_gen_or_i32(TCGV_LOW(ret), TCGV_LOW(ret), t1);
-
-        tcg_gen_and_i32(t1, TCGV_HIGH(v1), t0);
-        tcg_gen_andc_i32(TCGV_HIGH(ret), TCGV_HIGH(v2), t0);
-        tcg_gen_or_i32(TCGV_HIGH(ret), TCGV_HIGH(ret), t1);
-    }
-    tcg_temp_free_i32(t0);
-    tcg_temp_free_i32(t1);
-#else
-    if (TCG_TARGET_HAS_movcond_i64) {
-        tcg_gen_op6i_i64(INDEX_op_movcond_i64, ret, c1, c2, v1, v2, cond);
-    } else {
-        TCGv_i64 t0 = tcg_temp_new_i64();
-        TCGv_i64 t1 = tcg_temp_new_i64();
-        tcg_gen_setcond_i64(cond, t0, c1, c2);
-        tcg_gen_neg_i64(t0, t0);
-        tcg_gen_and_i64(t1, v1, t0);
-        tcg_gen_andc_i64(ret, v2, t0);
-        tcg_gen_or_i64(ret, ret, t1);
-        tcg_temp_free_i64(t0);
-        tcg_temp_free_i64(t1);
-    }
-#endif
-}
-
-static inline void tcg_gen_add2_i32(TCGv_i32 rl, TCGv_i32 rh, TCGv_i32 al,
-                                    TCGv_i32 ah, TCGv_i32 bl, TCGv_i32 bh)
-{
-    if (TCG_TARGET_HAS_add2_i32) {
-        tcg_gen_op6_i32(INDEX_op_add2_i32, rl, rh, al, ah, bl, bh);
-        /* Allow the optimizer room to replace add2 with two moves.  */
-        tcg_gen_op0(INDEX_op_nop);
-    } else {
-        TCGv_i64 t0 = tcg_temp_new_i64();
-        TCGv_i64 t1 = tcg_temp_new_i64();
-        tcg_gen_concat_i32_i64(t0, al, ah);
-        tcg_gen_concat_i32_i64(t1, bl, bh);
-        tcg_gen_add_i64(t0, t0, t1);
-        tcg_gen_extr_i64_i32(rl, rh, t0);
-        tcg_temp_free_i64(t0);
-        tcg_temp_free_i64(t1);
-    }
-}
-
-static inline void tcg_gen_sub2_i32(TCGv_i32 rl, TCGv_i32 rh, TCGv_i32 al,
-                                    TCGv_i32 ah, TCGv_i32 bl, TCGv_i32 bh)
-{
-    if (TCG_TARGET_HAS_sub2_i32) {
-        tcg_gen_op6_i32(INDEX_op_sub2_i32, rl, rh, al, ah, bl, bh);
-        /* Allow the optimizer room to replace sub2 with two moves.  */
-        tcg_gen_op0(INDEX_op_nop);
-    } else {
-        TCGv_i64 t0 = tcg_temp_new_i64();
-        TCGv_i64 t1 = tcg_temp_new_i64();
-        tcg_gen_concat_i32_i64(t0, al, ah);
-        tcg_gen_concat_i32_i64(t1, bl, bh);
-        tcg_gen_sub_i64(t0, t0, t1);
-        tcg_gen_extr_i64_i32(rl, rh, t0);
-        tcg_temp_free_i64(t0);
-        tcg_temp_free_i64(t1);
-    }
-}
-
-static inline void tcg_gen_mulu2_i32(TCGv_i32 rl, TCGv_i32 rh,
-                                     TCGv_i32 arg1, TCGv_i32 arg2)
-{
-    if (TCG_TARGET_HAS_mulu2_i32) {
-        tcg_gen_op4_i32(INDEX_op_mulu2_i32, rl, rh, arg1, arg2);
-        /* Allow the optimizer room to replace mulu2 with two moves.  */
-        tcg_gen_op0(INDEX_op_nop);
-    } else if (TCG_TARGET_HAS_muluh_i32) {
-        TCGv_i32 t = tcg_temp_new_i32();
-        tcg_gen_op3_i32(INDEX_op_mul_i32, t, arg1, arg2);
-        tcg_gen_op3_i32(INDEX_op_muluh_i32, rh, arg1, arg2);
-        tcg_gen_mov_i32(rl, t);
-        tcg_temp_free_i32(t);
-    } else {
-        TCGv_i64 t0 = tcg_temp_new_i64();
-        TCGv_i64 t1 = tcg_temp_new_i64();
-        tcg_gen_extu_i32_i64(t0, arg1);
-        tcg_gen_extu_i32_i64(t1, arg2);
-        tcg_gen_mul_i64(t0, t0, t1);
-        tcg_gen_extr_i64_i32(rl, rh, t0);
-        tcg_temp_free_i64(t0);
-        tcg_temp_free_i64(t1);
-    }
-}
-
-static inline void tcg_gen_muls2_i32(TCGv_i32 rl, TCGv_i32 rh,
-                                     TCGv_i32 arg1, TCGv_i32 arg2)
-{
-    if (TCG_TARGET_HAS_muls2_i32) {
-        tcg_gen_op4_i32(INDEX_op_muls2_i32, rl, rh, arg1, arg2);
-        /* Allow the optimizer room to replace muls2 with two moves.  */
-        tcg_gen_op0(INDEX_op_nop);
-    } else if (TCG_TARGET_HAS_mulsh_i32) {
-        TCGv_i32 t = tcg_temp_new_i32();
-        tcg_gen_op3_i32(INDEX_op_mul_i32, t, arg1, arg2);
-        tcg_gen_op3_i32(INDEX_op_mulsh_i32, rh, arg1, arg2);
-        tcg_gen_mov_i32(rl, t);
-        tcg_temp_free_i32(t);
-    } else if (TCG_TARGET_REG_BITS == 32) {
-        TCGv_i32 t0 = tcg_temp_new_i32();
-        TCGv_i32 t1 = tcg_temp_new_i32();
-        TCGv_i32 t2 = tcg_temp_new_i32();
-        TCGv_i32 t3 = tcg_temp_new_i32();
-        tcg_gen_mulu2_i32(t0, t1, arg1, arg2);
-        /* Adjust for negative inputs.  */
-        tcg_gen_sari_i32(t2, arg1, 31);
-        tcg_gen_sari_i32(t3, arg2, 31);
-        tcg_gen_and_i32(t2, t2, arg2);
-        tcg_gen_and_i32(t3, t3, arg1);
-        tcg_gen_sub_i32(rh, t1, t2);
-        tcg_gen_sub_i32(rh, rh, t3);
-        tcg_gen_mov_i32(rl, t0);
-        tcg_temp_free_i32(t0);
-        tcg_temp_free_i32(t1);
-        tcg_temp_free_i32(t2);
-        tcg_temp_free_i32(t3);
-    } else {
-        TCGv_i64 t0 = tcg_temp_new_i64();
-        TCGv_i64 t1 = tcg_temp_new_i64();
-        tcg_gen_ext_i32_i64(t0, arg1);
-        tcg_gen_ext_i32_i64(t1, arg2);
-        tcg_gen_mul_i64(t0, t0, t1);
-        tcg_gen_extr_i64_i32(rl, rh, t0);
-        tcg_temp_free_i64(t0);
-        tcg_temp_free_i64(t1);
-    }
-}
-
-static inline void tcg_gen_add2_i64(TCGv_i64 rl, TCGv_i64 rh, TCGv_i64 al,
-                                    TCGv_i64 ah, TCGv_i64 bl, TCGv_i64 bh)
-{
-    if (TCG_TARGET_HAS_add2_i64) {
-        tcg_gen_op6_i64(INDEX_op_add2_i64, rl, rh, al, ah, bl, bh);
-        /* Allow the optimizer room to replace add2 with two moves.  */
-        tcg_gen_op0(INDEX_op_nop);
-    } else {
-        TCGv_i64 t0 = tcg_temp_new_i64();
-        TCGv_i64 t1 = tcg_temp_new_i64();
-        tcg_gen_add_i64(t0, al, bl);
-        tcg_gen_setcond_i64(TCG_COND_LTU, t1, t0, al);
-        tcg_gen_add_i64(rh, ah, bh);
-        tcg_gen_add_i64(rh, rh, t1);
-        tcg_gen_mov_i64(rl, t0);
-        tcg_temp_free_i64(t0);
-        tcg_temp_free_i64(t1);
-    }
-}
-
-static inline void tcg_gen_sub2_i64(TCGv_i64 rl, TCGv_i64 rh, TCGv_i64 al,
-                                    TCGv_i64 ah, TCGv_i64 bl, TCGv_i64 bh)
-{
-    if (TCG_TARGET_HAS_sub2_i64) {
-        tcg_gen_op6_i64(INDEX_op_sub2_i64, rl, rh, al, ah, bl, bh);
-        /* Allow the optimizer room to replace sub2 with two moves.  */
-        tcg_gen_op0(INDEX_op_nop);
-    } else {
-        TCGv_i64 t0 = tcg_temp_new_i64();
-        TCGv_i64 t1 = tcg_temp_new_i64();
-        tcg_gen_sub_i64(t0, al, bl);
-        tcg_gen_setcond_i64(TCG_COND_LTU, t1, al, bl);
-        tcg_gen_sub_i64(rh, ah, bh);
-        tcg_gen_sub_i64(rh, rh, t1);
-        tcg_gen_mov_i64(rl, t0);
-        tcg_temp_free_i64(t0);
-        tcg_temp_free_i64(t1);
-    }
-}
-
-static inline void tcg_gen_mulu2_i64(TCGv_i64 rl, TCGv_i64 rh,
-                                     TCGv_i64 arg1, TCGv_i64 arg2)
-{
-    if (TCG_TARGET_HAS_mulu2_i64) {
-        tcg_gen_op4_i64(INDEX_op_mulu2_i64, rl, rh, arg1, arg2);
-        /* Allow the optimizer room to replace mulu2 with two moves.  */
-        tcg_gen_op0(INDEX_op_nop);
-    } else if (TCG_TARGET_HAS_muluh_i64) {
-        TCGv_i64 t = tcg_temp_new_i64();
-        tcg_gen_op3_i64(INDEX_op_mul_i64, t, arg1, arg2);
-        tcg_gen_op3_i64(INDEX_op_muluh_i64, rh, arg1, arg2);
-        tcg_gen_mov_i64(rl, t);
-        tcg_temp_free_i64(t);
-    } else {
-        TCGv_i64 t0 = tcg_temp_new_i64();
-        tcg_gen_mul_i64(t0, arg1, arg2);
-        gen_helper_muluh_i64(rh, arg1, arg2);
-        tcg_gen_mov_i64(rl, t0);
-        tcg_temp_free_i64(t0);
-    }
-}
-
-static inline void tcg_gen_muls2_i64(TCGv_i64 rl, TCGv_i64 rh,
-                                     TCGv_i64 arg1, TCGv_i64 arg2)
-{
-    if (TCG_TARGET_HAS_muls2_i64) {
-        tcg_gen_op4_i64(INDEX_op_muls2_i64, rl, rh, arg1, arg2);
-        /* Allow the optimizer room to replace muls2 with two moves.  */
-        tcg_gen_op0(INDEX_op_nop);
-    } else if (TCG_TARGET_HAS_mulsh_i64) {
-        TCGv_i64 t = tcg_temp_new_i64();
-        tcg_gen_op3_i64(INDEX_op_mul_i64, t, arg1, arg2);
-        tcg_gen_op3_i64(INDEX_op_mulsh_i64, rh, arg1, arg2);
-        tcg_gen_mov_i64(rl, t);
-        tcg_temp_free_i64(t);
-    } else if (TCG_TARGET_HAS_mulu2_i64 || TCG_TARGET_HAS_muluh_i64) {
-        TCGv_i64 t0 = tcg_temp_new_i64();
-        TCGv_i64 t1 = tcg_temp_new_i64();
-        TCGv_i64 t2 = tcg_temp_new_i64();
-        TCGv_i64 t3 = tcg_temp_new_i64();
-        tcg_gen_mulu2_i64(t0, t1, arg1, arg2);
-        /* Adjust for negative inputs.  */
-        tcg_gen_sari_i64(t2, arg1, 63);
-        tcg_gen_sari_i64(t3, arg2, 63);
-        tcg_gen_and_i64(t2, t2, arg2);
-        tcg_gen_and_i64(t3, t3, arg1);
-        tcg_gen_sub_i64(rh, t1, t2);
-        tcg_gen_sub_i64(rh, rh, t3);
-        tcg_gen_mov_i64(rl, t0);
-        tcg_temp_free_i64(t0);
-        tcg_temp_free_i64(t1);
-        tcg_temp_free_i64(t2);
-        tcg_temp_free_i64(t3);
-    } else {
-        TCGv_i64 t0 = tcg_temp_new_i64();
-        tcg_gen_mul_i64(t0, arg1, arg2);
-        gen_helper_mulsh_i64(rh, arg1, arg2);
-        tcg_gen_mov_i64(rl, t0);
-        tcg_temp_free_i64(t0);
-    }
-}
-
-/***************************************/
-/* QEMU specific operations. Their type depend on the QEMU CPU
-   type. */
-#ifndef TARGET_LONG_BITS
-#error must include QEMU headers
-#endif
+void tcg_gen_goto_tb(unsigned idx);
 
 #if TARGET_LONG_BITS == 32
 #define TCGv TCGv_i32
@@ -2473,7 +737,6 @@ static inline void tcg_gen_muls2_i64(TCGv_i64 rl, TCGv_i64 rh,
 #define TCGV_UNUSED(x) TCGV_UNUSED_I32(x)
 #define TCGV_IS_UNUSED(x) TCGV_IS_UNUSED_I32(x)
 #define TCGV_EQUAL(a, b) TCGV_EQUAL_I32(a, b)
-#define tcg_add_param_tl tcg_add_param_i32
 #define tcg_gen_qemu_ld_tl tcg_gen_qemu_ld_i32
 #define tcg_gen_qemu_st_tl tcg_gen_qemu_st_i32
 #else
@@ -2486,41 +749,10 @@ static inline void tcg_gen_muls2_i64(TCGv_i64 rl, TCGv_i64 rh,
 #define TCGV_UNUSED(x) TCGV_UNUSED_I64(x)
 #define TCGV_IS_UNUSED(x) TCGV_IS_UNUSED_I64(x)
 #define TCGV_EQUAL(a, b) TCGV_EQUAL_I64(a, b)
-#define tcg_add_param_tl tcg_add_param_i64
 #define tcg_gen_qemu_ld_tl tcg_gen_qemu_ld_i64
 #define tcg_gen_qemu_st_tl tcg_gen_qemu_st_i64
 #endif
 
-/* debug info: write the PC of the corresponding QEMU CPU instruction */
-static inline void tcg_gen_debug_insn_start(uint64_t pc)
-{
-    /* XXX: must really use a 32 bit size for TCGArg in all cases */
-#if TARGET_LONG_BITS > TCG_TARGET_REG_BITS
-    tcg_gen_op2ii(INDEX_op_debug_insn_start, 
-                  (uint32_t)(pc), (uint32_t)(pc >> 32));
-#else
-    tcg_gen_op1i(INDEX_op_debug_insn_start, pc);
-#endif
-}
-
-static inline void tcg_gen_exit_tb(uintptr_t val)
-{
-    tcg_gen_op1i(INDEX_op_exit_tb, val);
-}
-
-static inline void tcg_gen_goto_tb(unsigned idx)
-{
-    /* We only support two chained exits.  */
-    tcg_debug_assert(idx <= 1);
-#ifdef CONFIG_DEBUG_TCG
-    /* Verify that we havn't seen this numbered exit before.  */
-    tcg_debug_assert((tcg_ctx.goto_tb_issue_mask & (1 << idx)) == 0);
-    tcg_ctx.goto_tb_issue_mask |= 1 << idx;
-#endif
-    tcg_gen_op1i(INDEX_op_goto_tb, idx);
-}
-
-
 void tcg_gen_qemu_ld_i32(TCGv_i32, TCGv, TCGArg, TCGMemOp);
 void tcg_gen_qemu_st_i32(TCGv_i32, TCGv, TCGArg, TCGMemOp);
 void tcg_gen_qemu_ld_i64(TCGv_i64, TCGv, TCGArg, TCGMemOp);
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 7a84b87..ae9811f 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -870,143 +870,6 @@ void tcg_gen_callN(TCGContext *s, void *func, TCGArg ret,
 #endif /* TCG_TARGET_EXTEND_ARGS */
 }
 
-#if TCG_TARGET_REG_BITS == 32
-void tcg_gen_shifti_i64(TCGv_i64 ret, TCGv_i64 arg1,
-                        int c, int right, int arith)
-{
-    if (c == 0) {
-        tcg_gen_mov_i32(TCGV_LOW(ret), TCGV_LOW(arg1));
-        tcg_gen_mov_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1));
-    } else if (c >= 32) {
-        c -= 32;
-        if (right) {
-            if (arith) {
-                tcg_gen_sari_i32(TCGV_LOW(ret), TCGV_HIGH(arg1), c);
-                tcg_gen_sari_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1), 31);
-            } else {
-                tcg_gen_shri_i32(TCGV_LOW(ret), TCGV_HIGH(arg1), c);
-                tcg_gen_movi_i32(TCGV_HIGH(ret), 0);
-            }
-        } else {
-            tcg_gen_shli_i32(TCGV_HIGH(ret), TCGV_LOW(arg1), c);
-            tcg_gen_movi_i32(TCGV_LOW(ret), 0);
-        }
-    } else {
-        TCGv_i32 t0, t1;
-
-        t0 = tcg_temp_new_i32();
-        t1 = tcg_temp_new_i32();
-        if (right) {
-            tcg_gen_shli_i32(t0, TCGV_HIGH(arg1), 32 - c);
-            if (arith)
-                tcg_gen_sari_i32(t1, TCGV_HIGH(arg1), c);
-            else
-                tcg_gen_shri_i32(t1, TCGV_HIGH(arg1), c);
-            tcg_gen_shri_i32(TCGV_LOW(ret), TCGV_LOW(arg1), c);
-            tcg_gen_or_i32(TCGV_LOW(ret), TCGV_LOW(ret), t0);
-            tcg_gen_mov_i32(TCGV_HIGH(ret), t1);
-        } else {
-            tcg_gen_shri_i32(t0, TCGV_LOW(arg1), 32 - c);
-            /* Note: ret can be the same as arg1, so we use t1 */
-            tcg_gen_shli_i32(t1, TCGV_LOW(arg1), c);
-            tcg_gen_shli_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1), c);
-            tcg_gen_or_i32(TCGV_HIGH(ret), TCGV_HIGH(ret), t0);
-            tcg_gen_mov_i32(TCGV_LOW(ret), t1);
-        }
-        tcg_temp_free_i32(t0);
-        tcg_temp_free_i32(t1);
-    }
-}
-#endif
-
-static inline TCGMemOp tcg_canonicalize_memop(TCGMemOp op, bool is64, bool st)
-{
-    switch (op & MO_SIZE) {
-    case MO_8:
-        op &= ~MO_BSWAP;
-        break;
-    case MO_16:
-        break;
-    case MO_32:
-        if (!is64) {
-            op &= ~MO_SIGN;
-        }
-        break;
-    case MO_64:
-        if (!is64) {
-            tcg_abort();
-        }
-        break;
-    }
-    if (st) {
-        op &= ~MO_SIGN;
-    }
-    return op;
-}
-
-void tcg_gen_qemu_ld_i32(TCGv_i32 val, TCGv addr, TCGArg idx, TCGMemOp memop)
-{
-    memop = tcg_canonicalize_memop(memop, 0, 0);
-
-    *tcg_ctx.gen_opc_ptr++ = INDEX_op_qemu_ld_i32;
-    tcg_add_param_i32(val);
-    tcg_add_param_tl(addr);
-    *tcg_ctx.gen_opparam_ptr++ = memop;
-    *tcg_ctx.gen_opparam_ptr++ = idx;
-}
-
-void tcg_gen_qemu_st_i32(TCGv_i32 val, TCGv addr, TCGArg idx, TCGMemOp memop)
-{
-    memop = tcg_canonicalize_memop(memop, 0, 1);
-
-    *tcg_ctx.gen_opc_ptr++ = INDEX_op_qemu_st_i32;
-    tcg_add_param_i32(val);
-    tcg_add_param_tl(addr);
-    *tcg_ctx.gen_opparam_ptr++ = memop;
-    *tcg_ctx.gen_opparam_ptr++ = idx;
-}
-
-void tcg_gen_qemu_ld_i64(TCGv_i64 val, TCGv addr, TCGArg idx, TCGMemOp memop)
-{
-    memop = tcg_canonicalize_memop(memop, 1, 0);
-
-#if TCG_TARGET_REG_BITS == 32
-    if ((memop & MO_SIZE) < MO_64) {
-        tcg_gen_qemu_ld_i32(TCGV_LOW(val), addr, idx, memop);
-        if (memop & MO_SIGN) {
-            tcg_gen_sari_i32(TCGV_HIGH(val), TCGV_LOW(val), 31);
-        } else {
-            tcg_gen_movi_i32(TCGV_HIGH(val), 0);
-        }
-        return;
-    }
-#endif
-
-    *tcg_ctx.gen_opc_ptr++ = INDEX_op_qemu_ld_i64;
-    tcg_add_param_i64(val);
-    tcg_add_param_tl(addr);
-    *tcg_ctx.gen_opparam_ptr++ = memop;
-    *tcg_ctx.gen_opparam_ptr++ = idx;
-}
-
-void tcg_gen_qemu_st_i64(TCGv_i64 val, TCGv addr, TCGArg idx, TCGMemOp memop)
-{
-    memop = tcg_canonicalize_memop(memop, 1, 1);
-
-#if TCG_TARGET_REG_BITS == 32
-    if ((memop & MO_SIZE) < MO_64) {
-        tcg_gen_qemu_st_i32(TCGV_LOW(val), addr, idx, memop);
-        return;
-    }
-#endif
-
-    *tcg_ctx.gen_opc_ptr++ = INDEX_op_qemu_st_i64;
-    tcg_add_param_i64(val);
-    tcg_add_param_tl(addr);
-    *tcg_ctx.gen_opparam_ptr++ = memop;
-    *tcg_ctx.gen_opparam_ptr++ = idx;
-}
-
 static void tcg_reg_alloc_start(TCGContext *s)
 {
     int i;
diff --git a/tcg/tcg.h b/tcg/tcg.h
index 7285f71..f4b9033 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -705,9 +705,6 @@ void tcg_add_target_add_op_defs(const TCGTargetOpDef *tdefs);
 void tcg_gen_callN(TCGContext *s, void *func,
                    TCGArg ret, int nargs, TCGArg *args);
 
-void tcg_gen_shifti_i64(TCGv_i64 ret, TCGv_i64 arg1,
-                        int c, int right, int arith);
-
 TCGArg *tcg_optimize(TCGContext *s, uint16_t *tcg_opc_ptr, TCGArg *args,
                      TCGOpDef *tcg_op_def);
 
-- 
1.9.3

* [Qemu-devel] [PATCH 2.3 2/8] tcg: Reduce ifdefs in tcg-op.c
  2014-11-11 16:24 [Qemu-devel] [PATCH 2.3 0/8] Linked list for tcg ops Richard Henderson
  2014-11-11 16:24 ` [Qemu-devel] [PATCH 2.3 1/8] tcg: Move some opcode generation functions out of line Richard Henderson
@ 2014-11-11 16:24 ` Richard Henderson
  2014-11-14 18:20   ` Bastian Koppelmann
  2014-11-11 16:24 ` [Qemu-devel] [PATCH 2.3 3/8] tcg: Move emit of INDEX_op_end into gen_tb_end Richard Henderson
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 21+ messages in thread
From: Richard Henderson @ 2014-11-11 16:24 UTC (permalink / raw)
  To: qemu-devel; +Cc: aurelien

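Replace nearly all of the ifdefs in this file with ordinary C
conditionals on TCG_TARGET_REG_BITS, so that the 32-bit and 64-bit
paths are both parsed and type-checked on every host.  This improves
confidence in the less frequently used 32-bit builds.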

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/tcg-op.c | 449 +++++++++++++++++++++++++++--------------------------------
 1 file changed, 207 insertions(+), 242 deletions(-)
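
The pattern applied throughout the diff below can be sketched in
isolation.  This is a minimal illustration, not code from the patch:
REG_BITS, andi64, andi64_split, shli64 and shli64_split are hypothetical
stand-ins for TCG_TARGET_REG_BITS and the tcg_gen_* helpers, and they
compute values directly instead of emitting opcodes.  The point is only
that a conditional on a compile-time constant keeps both paths visible
to the compiler, whereas #if hides one of them.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for TCG_TARGET_REG_BITS. */
#ifndef REG_BITS
#define REG_BITS 64
#endif

/* "32-bit host" path: AND each 32-bit half separately. */
static uint64_t andi64_split(uint64_t arg, uint64_t imm)
{
    uint32_t lo = (uint32_t)arg & (uint32_t)imm;
    uint32_t hi = (uint32_t)(arg >> 32) & (uint32_t)(imm >> 32);
    return ((uint64_t)hi << 32) | lo;
}

/* "32-bit host" path: left shift built from 32-bit shifts only. */
static uint64_t shli64_split(uint64_t arg, unsigned c)
{
    uint32_t lo = (uint32_t)arg;
    uint32_t hi = (uint32_t)(arg >> 32);

    if (c == 0) {
        /* Plain move; nothing to shift. */
    } else if (c >= 32) {
        /* The low half moves entirely into the high half. */
        hi = lo << (c - 32);
        lo = 0;
    } else {
        /* Bits shifted out of the low half carry into the high half. */
        hi = (hi << c) | (lo >> (32 - c));
        lo <<= c;
    }
    return ((uint64_t)hi << 32) | lo;
}

uint64_t andi64(uint64_t arg, uint64_t imm)
{
    /* A C conditional instead of #if: both branches are compiled,
       and the compiler is expected to discard the dead one. */
    if (REG_BITS == 32) {
        return andi64_split(arg, imm);
    }
    return arg & imm;
}

uint64_t shli64(uint64_t arg, unsigned c)
{
    assert(c < 64);
    if (REG_BITS == 32) {
        return shli64_split(arg, c);
    } else if (c == 0) {
        return arg;
    } else {
        return arg << c;
    }
}
```

Building the sketch with -DREG_BITS=32 selects the split path without
touching the source.  In either configuration a type error in the
unused branch is still diagnosed, which is the benefit claimed for the
32-bit builds.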

diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index a6fd0a6..5305f1d 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -25,6 +25,15 @@
 #include "tcg.h"
 #include "tcg-op.h"
 
+/* Reduce the number of ifdefs below.  This assumes that all uses of
+   TCGV_HIGH and TCGV_LOW are properly protected by a conditional that
+   the compiler can eliminate.  */
+#if TCG_TARGET_REG_BITS == 64
+extern TCGv_i32 TCGV_LOW_link_error(TCGv_i64);
+extern TCGv_i32 TCGV_HIGH_link_error(TCGv_i64);
+#define TCGV_LOW  TCGV_LOW_link_error
+#define TCGV_HIGH TCGV_HIGH_link_error
+#endif
 
 void tcg_gen_op0(TCGContext *ctx, TCGOpcode opc)
 {
@@ -901,11 +910,14 @@ void tcg_gen_subi_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
 
 void tcg_gen_andi_i64(TCGv_i64 ret, TCGv_i64 arg1, uint64_t arg2)
 {
-#if TCG_TARGET_REG_BITS == 32
-    tcg_gen_andi_i32(TCGV_LOW(ret), TCGV_LOW(arg1), arg2);
-    tcg_gen_andi_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1), arg2 >> 32);
-#else
     TCGv_i64 t0;
+
+    if (TCG_TARGET_REG_BITS == 32) {
+        tcg_gen_andi_i32(TCGV_LOW(ret), TCGV_LOW(arg1), arg2);
+        tcg_gen_andi_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1), arg2 >> 32);
+        return;
+    }
+
     /* Some cases can be optimized here.  */
     switch (arg2) {
     case 0:
@@ -937,15 +949,15 @@ void tcg_gen_andi_i64(TCGv_i64 ret, TCGv_i64 arg1, uint64_t arg2)
     t0 = tcg_const_i64(arg2);
     tcg_gen_and_i64(ret, arg1, t0);
     tcg_temp_free_i64(t0);
-#endif
 }
 
 void tcg_gen_ori_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
 {
-#if TCG_TARGET_REG_BITS == 32
-    tcg_gen_ori_i32(TCGV_LOW(ret), TCGV_LOW(arg1), arg2);
-    tcg_gen_ori_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1), arg2 >> 32);
-#else
+    if (TCG_TARGET_REG_BITS == 32) {
+        tcg_gen_ori_i32(TCGV_LOW(ret), TCGV_LOW(arg1), arg2);
+        tcg_gen_ori_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1), arg2 >> 32);
+        return;
+    }
     /* Some cases can be optimized here.  */
     if (arg2 == -1) {
         tcg_gen_movi_i64(ret, -1);
@@ -956,15 +968,15 @@ void tcg_gen_ori_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
         tcg_gen_or_i64(ret, arg1, t0);
         tcg_temp_free_i64(t0);
     }
-#endif
 }
 
 void tcg_gen_xori_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
 {
-#if TCG_TARGET_REG_BITS == 32
-    tcg_gen_xori_i32(TCGV_LOW(ret), TCGV_LOW(arg1), arg2);
-    tcg_gen_xori_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1), arg2 >> 32);
-#else
+    if (TCG_TARGET_REG_BITS == 32) {
+        tcg_gen_xori_i32(TCGV_LOW(ret), TCGV_LOW(arg1), arg2);
+        tcg_gen_xori_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1), arg2 >> 32);
+        return;
+    }
     /* Some cases can be optimized here.  */
     if (arg2 == 0) {
         tcg_gen_mov_i64(ret, arg1);
@@ -976,10 +988,8 @@ void tcg_gen_xori_i64(TCGv_i64 ret, TCGv_i64 arg1, int64_t arg2)
         tcg_gen_xor_i64(ret, arg1, t0);
         tcg_temp_free_i64(t0);
     }
-#endif
 }
 
-#if TCG_TARGET_REG_BITS == 32
 static inline void tcg_gen_shifti_i64(TCGv_i64 ret, TCGv_i64 arg1,
                                       unsigned c, bool right, bool arith)
 {
@@ -1031,23 +1041,10 @@ static inline void tcg_gen_shifti_i64(TCGv_i64 ret, TCGv_i64 arg1,
 
 void tcg_gen_shli_i64(TCGv_i64 ret, TCGv_i64 arg1, unsigned arg2)
 {
-    tcg_gen_shifti_i64(ret, arg1, arg2, 0, 0);
-}
-
-void tcg_gen_shri_i64(TCGv_i64 ret, TCGv_i64 arg1, unsigned arg2)
-{
-    tcg_gen_shifti_i64(ret, arg1, arg2, 1, 0);
-}
-
-void tcg_gen_sari_i64(TCGv_i64 ret, TCGv_i64 arg1, unsigned arg2)
-{
-    tcg_gen_shifti_i64(ret, arg1, arg2, 1, 1);
-}
-#else /* TCG_TARGET_REG_SIZE == 64 */
-void tcg_gen_shli_i64(TCGv_i64 ret, TCGv_i64 arg1, unsigned arg2)
-{
     tcg_debug_assert(arg2 < 64);
-    if (arg2 == 0) {
+    if (TCG_TARGET_REG_BITS == 32) {
+        tcg_gen_shifti_i64(ret, arg1, arg2, 0, 0);
+    } else if (arg2 == 0) {
         tcg_gen_mov_i64(ret, arg1);
     } else {
         TCGv_i64 t0 = tcg_const_i64(arg2);
@@ -1059,7 +1056,9 @@ void tcg_gen_shli_i64(TCGv_i64 ret, TCGv_i64 arg1, unsigned arg2)
 void tcg_gen_shri_i64(TCGv_i64 ret, TCGv_i64 arg1, unsigned arg2)
 {
     tcg_debug_assert(arg2 < 64);
-    if (arg2 == 0) {
+    if (TCG_TARGET_REG_BITS == 32) {
+        tcg_gen_shifti_i64(ret, arg1, arg2, 1, 0);
+    } else if (arg2 == 0) {
         tcg_gen_mov_i64(ret, arg1);
     } else {
         TCGv_i64 t0 = tcg_const_i64(arg2);
@@ -1071,7 +1070,9 @@ void tcg_gen_shri_i64(TCGv_i64 ret, TCGv_i64 arg1, unsigned arg2)
 void tcg_gen_sari_i64(TCGv_i64 ret, TCGv_i64 arg1, unsigned arg2)
 {
     tcg_debug_assert(arg2 < 64);
-    if (arg2 == 0) {
+    if (TCG_TARGET_REG_BITS == 32) {
+        tcg_gen_shifti_i64(ret, arg1, arg2, 1, 1);
+    } else if (arg2 == 0) {
         tcg_gen_mov_i64(ret, arg1);
     } else {
         TCGv_i64 t0 = tcg_const_i64(arg2);
@@ -1079,20 +1080,19 @@ void tcg_gen_sari_i64(TCGv_i64 ret, TCGv_i64 arg1, unsigned arg2)
         tcg_temp_free_i64(t0);
     }
 }
-#endif /* TCG_TARGET_REG_SIZE */
 
 void tcg_gen_brcond_i64(TCGCond cond, TCGv_i64 arg1, TCGv_i64 arg2, int label)
 {
     if (cond == TCG_COND_ALWAYS) {
         tcg_gen_br(label);
     } else if (cond != TCG_COND_NEVER) {
-#if TCG_TARGET_REG_BITS == 32
-        tcg_gen_op6ii_i32(INDEX_op_brcond2_i32, TCGV_LOW(arg1),
-                          TCGV_HIGH(arg1), TCGV_LOW(arg2),
-                          TCGV_HIGH(arg2), cond, label);
-#else
-        tcg_gen_op4ii_i64(INDEX_op_brcond_i64, arg1, arg2, cond, label);
-#endif
+        if (TCG_TARGET_REG_BITS == 32) {
+            tcg_gen_op6ii_i32(INDEX_op_brcond2_i32, TCGV_LOW(arg1),
+                              TCGV_HIGH(arg1), TCGV_LOW(arg2),
+                              TCGV_HIGH(arg2), cond, label);
+        } else {
+            tcg_gen_op4ii_i64(INDEX_op_brcond_i64, arg1, arg2, cond, label);
+        }
     }
 }
 
@@ -1115,14 +1115,14 @@ void tcg_gen_setcond_i64(TCGCond cond, TCGv_i64 ret,
     } else if (cond == TCG_COND_NEVER) {
         tcg_gen_movi_i64(ret, 0);
     } else {
-#if TCG_TARGET_REG_BITS == 32
-        tcg_gen_op6i_i32(INDEX_op_setcond2_i32, TCGV_LOW(ret),
-                         TCGV_LOW(arg1), TCGV_HIGH(arg1),
-                         TCGV_LOW(arg2), TCGV_HIGH(arg2), cond);
-        tcg_gen_movi_i32(TCGV_HIGH(ret), 0);
-#else
-        tcg_gen_op4i_i64(INDEX_op_setcond_i64, ret, arg1, arg2, cond);
-#endif
+        if (TCG_TARGET_REG_BITS == 32) {
+            tcg_gen_op6i_i32(INDEX_op_setcond2_i32, TCGV_LOW(ret),
+                             TCGV_LOW(arg1), TCGV_HIGH(arg1),
+                             TCGV_LOW(arg2), TCGV_HIGH(arg2), cond);
+            tcg_gen_movi_i32(TCGV_HIGH(ret), 0);
+        } else {
+            tcg_gen_op4i_i64(INDEX_op_setcond_i64, ret, arg1, arg2, cond);
+        }
     }
 }
 
@@ -1211,99 +1211,86 @@ void tcg_gen_remu_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
 
 void tcg_gen_ext8s_i64(TCGv_i64 ret, TCGv_i64 arg)
 {
-#if TCG_TARGET_REG_BITS == 32
-    tcg_gen_ext8s_i32(TCGV_LOW(ret), TCGV_LOW(arg));
-    tcg_gen_sari_i32(TCGV_HIGH(ret), TCGV_LOW(ret), 31);
-#else
-    if (TCG_TARGET_HAS_ext8s_i64) {
+    if (TCG_TARGET_REG_BITS == 32) {
+        tcg_gen_ext8s_i32(TCGV_LOW(ret), TCGV_LOW(arg));
+        tcg_gen_sari_i32(TCGV_HIGH(ret), TCGV_LOW(ret), 31);
+    } else if (TCG_TARGET_HAS_ext8s_i64) {
         tcg_gen_op2_i64(INDEX_op_ext8s_i64, ret, arg);
     } else {
         tcg_gen_shli_i64(ret, arg, 56);
         tcg_gen_sari_i64(ret, ret, 56);
     }
-#endif
 }
 
 void tcg_gen_ext16s_i64(TCGv_i64 ret, TCGv_i64 arg)
 {
-#if TCG_TARGET_REG_BITS == 32
-    tcg_gen_ext16s_i32(TCGV_LOW(ret), TCGV_LOW(arg));
-    tcg_gen_sari_i32(TCGV_HIGH(ret), TCGV_LOW(ret), 31);
-#else
-    if (TCG_TARGET_HAS_ext16s_i64) {
+    if (TCG_TARGET_REG_BITS == 32) {
+        tcg_gen_ext16s_i32(TCGV_LOW(ret), TCGV_LOW(arg));
+        tcg_gen_sari_i32(TCGV_HIGH(ret), TCGV_LOW(ret), 31);
+    } else if (TCG_TARGET_HAS_ext16s_i64) {
         tcg_gen_op2_i64(INDEX_op_ext16s_i64, ret, arg);
     } else {
         tcg_gen_shli_i64(ret, arg, 48);
         tcg_gen_sari_i64(ret, ret, 48);
     }
-#endif
 }
 
 void tcg_gen_ext32s_i64(TCGv_i64 ret, TCGv_i64 arg)
 {
-#if TCG_TARGET_REG_BITS == 32
-    tcg_gen_mov_i32(TCGV_LOW(ret), TCGV_LOW(arg));
-    tcg_gen_sari_i32(TCGV_HIGH(ret), TCGV_LOW(ret), 31);
-#else
-    if (TCG_TARGET_HAS_ext32s_i64) {
+    if (TCG_TARGET_REG_BITS == 32) {
+        tcg_gen_mov_i32(TCGV_LOW(ret), TCGV_LOW(arg));
+        tcg_gen_sari_i32(TCGV_HIGH(ret), TCGV_LOW(ret), 31);
+    } else if (TCG_TARGET_HAS_ext32s_i64) {
         tcg_gen_op2_i64(INDEX_op_ext32s_i64, ret, arg);
     } else {
         tcg_gen_shli_i64(ret, arg, 32);
         tcg_gen_sari_i64(ret, ret, 32);
     }
-#endif
 }
 
 void tcg_gen_ext8u_i64(TCGv_i64 ret, TCGv_i64 arg)
 {
-#if TCG_TARGET_REG_BITS == 32
-    tcg_gen_ext8u_i32(TCGV_LOW(ret), TCGV_LOW(arg));
-    tcg_gen_movi_i32(TCGV_HIGH(ret), 0);
-#else
-    if (TCG_TARGET_HAS_ext8u_i64) {
+    if (TCG_TARGET_REG_BITS == 32) {
+        tcg_gen_ext8u_i32(TCGV_LOW(ret), TCGV_LOW(arg));
+        tcg_gen_movi_i32(TCGV_HIGH(ret), 0);
+    } else if (TCG_TARGET_HAS_ext8u_i64) {
         tcg_gen_op2_i64(INDEX_op_ext8u_i64, ret, arg);
     } else {
         tcg_gen_andi_i64(ret, arg, 0xffu);
     }
-#endif
 }
 
 void tcg_gen_ext16u_i64(TCGv_i64 ret, TCGv_i64 arg)
 {
-#if TCG_TARGET_REG_BITS == 32
-    tcg_gen_ext16u_i32(TCGV_LOW(ret), TCGV_LOW(arg));
-    tcg_gen_movi_i32(TCGV_HIGH(ret), 0);
-#else
-    if (TCG_TARGET_HAS_ext16u_i64) {
+    if (TCG_TARGET_REG_BITS == 32) {
+        tcg_gen_ext16u_i32(TCGV_LOW(ret), TCGV_LOW(arg));
+        tcg_gen_movi_i32(TCGV_HIGH(ret), 0);
+    } else if (TCG_TARGET_HAS_ext16u_i64) {
         tcg_gen_op2_i64(INDEX_op_ext16u_i64, ret, arg);
     } else {
         tcg_gen_andi_i64(ret, arg, 0xffffu);
     }
-#endif
 }
 
 void tcg_gen_ext32u_i64(TCGv_i64 ret, TCGv_i64 arg)
 {
-#if TCG_TARGET_REG_BITS == 32
-    tcg_gen_mov_i32(TCGV_LOW(ret), TCGV_LOW(arg));
-    tcg_gen_movi_i32(TCGV_HIGH(ret), 0);
-#else
-    if (TCG_TARGET_HAS_ext32u_i64) {
+    if (TCG_TARGET_REG_BITS == 32) {
+        tcg_gen_mov_i32(TCGV_LOW(ret), TCGV_LOW(arg));
+        tcg_gen_movi_i32(TCGV_HIGH(ret), 0);
+    } else if (TCG_TARGET_HAS_ext32u_i64) {
         tcg_gen_op2_i64(INDEX_op_ext32u_i64, ret, arg);
     } else {
         tcg_gen_andi_i64(ret, arg, 0xffffffffu);
     }
-#endif
 }
 
 /* Note: we assume the six high bytes are set to zero */
 void tcg_gen_bswap16_i64(TCGv_i64 ret, TCGv_i64 arg)
 {
-#if TCG_TARGET_REG_BITS == 32
-    tcg_gen_bswap16_i32(TCGV_LOW(ret), TCGV_LOW(arg));
-    tcg_gen_movi_i32(TCGV_HIGH(ret), 0);
-#else
-    if (TCG_TARGET_HAS_bswap16_i64) {
+    if (TCG_TARGET_REG_BITS == 32) {
+        tcg_gen_bswap16_i32(TCGV_LOW(ret), TCGV_LOW(arg));
+        tcg_gen_movi_i32(TCGV_HIGH(ret), 0);
+    } else if (TCG_TARGET_HAS_bswap16_i64) {
         tcg_gen_op2_i64(INDEX_op_bswap16_i64, ret, arg);
     } else {
         TCGv_i64 t0 = tcg_temp_new_i64();
@@ -1314,17 +1301,15 @@ void tcg_gen_bswap16_i64(TCGv_i64 ret, TCGv_i64 arg)
         tcg_gen_or_i64(ret, ret, t0);
         tcg_temp_free_i64(t0);
     }
-#endif
 }
 
 /* Note: we assume the four high bytes are set to zero */
 void tcg_gen_bswap32_i64(TCGv_i64 ret, TCGv_i64 arg)
 {
-#if TCG_TARGET_REG_BITS == 32
-    tcg_gen_bswap32_i32(TCGV_LOW(ret), TCGV_LOW(arg));
-    tcg_gen_movi_i32(TCGV_HIGH(ret), 0);
-#else
-    if (TCG_TARGET_HAS_bswap32_i64) {
+    if (TCG_TARGET_REG_BITS == 32) {
+        tcg_gen_bswap32_i32(TCGV_LOW(ret), TCGV_LOW(arg));
+        tcg_gen_movi_i32(TCGV_HIGH(ret), 0);
+    } else if (TCG_TARGET_HAS_bswap32_i64) {
         tcg_gen_op2_i64(INDEX_op_bswap32_i64, ret, arg);
     } else {
         TCGv_i64 t0, t1;
@@ -1347,24 +1332,22 @@ void tcg_gen_bswap32_i64(TCGv_i64 ret, TCGv_i64 arg)
         tcg_temp_free_i64(t0);
         tcg_temp_free_i64(t1);
     }
-#endif
 }
 
 void tcg_gen_bswap64_i64(TCGv_i64 ret, TCGv_i64 arg)
 {
-#if TCG_TARGET_REG_BITS == 32
-    TCGv_i32 t0, t1;
-    t0 = tcg_temp_new_i32();
-    t1 = tcg_temp_new_i32();
+    if (TCG_TARGET_REG_BITS == 32) {
+        TCGv_i32 t0, t1;
+        t0 = tcg_temp_new_i32();
+        t1 = tcg_temp_new_i32();
 
-    tcg_gen_bswap32_i32(t0, TCGV_LOW(arg));
-    tcg_gen_bswap32_i32(t1, TCGV_HIGH(arg));
-    tcg_gen_mov_i32(TCGV_LOW(ret), t1);
-    tcg_gen_mov_i32(TCGV_HIGH(ret), t0);
-    tcg_temp_free_i32(t0);
-    tcg_temp_free_i32(t1);
-#else
-    if (TCG_TARGET_HAS_bswap64_i64) {
+        tcg_gen_bswap32_i32(t0, TCGV_LOW(arg));
+        tcg_gen_bswap32_i32(t1, TCGV_HIGH(arg));
+        tcg_gen_mov_i32(TCGV_LOW(ret), t1);
+        tcg_gen_mov_i32(TCGV_HIGH(ret), t0);
+        tcg_temp_free_i32(t0);
+        tcg_temp_free_i32(t1);
+    } else if (TCG_TARGET_HAS_bswap64_i64) {
         tcg_gen_op2_i64(INDEX_op_bswap64_i64, ret, arg);
     } else {
         TCGv_i64 t0 = tcg_temp_new_i64();
@@ -1401,27 +1384,26 @@ void tcg_gen_bswap64_i64(TCGv_i64 ret, TCGv_i64 arg)
         tcg_temp_free_i64(t0);
         tcg_temp_free_i64(t1);
     }
-#endif
 }
 
 void tcg_gen_not_i64(TCGv_i64 ret, TCGv_i64 arg)
 {
-#if TCG_TARGET_REG_BITS == 64
-    if (TCG_TARGET_HAS_not_i64) {
+    if (TCG_TARGET_REG_BITS == 32) {
+        tcg_gen_not_i32(TCGV_LOW(ret), TCGV_LOW(arg));
+        tcg_gen_not_i32(TCGV_HIGH(ret), TCGV_HIGH(arg));
+    } else if (TCG_TARGET_HAS_not_i64) {
         tcg_gen_op2_i64(INDEX_op_not_i64, ret, arg);
     } else {
         tcg_gen_xori_i64(ret, arg, -1);
     }
-#else
-    tcg_gen_not_i32(TCGV_LOW(ret), TCGV_LOW(arg));
-    tcg_gen_not_i32(TCGV_HIGH(ret), TCGV_HIGH(arg));
-#endif
 }
 
 void tcg_gen_andc_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
 {
-#if TCG_TARGET_REG_BITS == 64
-    if (TCG_TARGET_HAS_andc_i64) {
+    if (TCG_TARGET_REG_BITS == 32) {
+        tcg_gen_andc_i32(TCGV_LOW(ret), TCGV_LOW(arg1), TCGV_LOW(arg2));
+        tcg_gen_andc_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1), TCGV_HIGH(arg2));
+    } else if (TCG_TARGET_HAS_andc_i64) {
         tcg_gen_op3_i64(INDEX_op_andc_i64, ret, arg1, arg2);
     } else {
         TCGv_i64 t0 = tcg_temp_new_i64();
@@ -1429,61 +1411,53 @@ void tcg_gen_andc_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
         tcg_gen_and_i64(ret, arg1, t0);
         tcg_temp_free_i64(t0);
     }
-#else
-    tcg_gen_andc_i32(TCGV_LOW(ret), TCGV_LOW(arg1), TCGV_LOW(arg2));
-    tcg_gen_andc_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1), TCGV_HIGH(arg2));
-#endif
 }
 
 void tcg_gen_eqv_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
 {
-#if TCG_TARGET_REG_BITS == 64
-    if (TCG_TARGET_HAS_eqv_i64) {
+    if (TCG_TARGET_REG_BITS == 32) {
+        tcg_gen_eqv_i32(TCGV_LOW(ret), TCGV_LOW(arg1), TCGV_LOW(arg2));
+        tcg_gen_eqv_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1), TCGV_HIGH(arg2));
+    } else if (TCG_TARGET_HAS_eqv_i64) {
         tcg_gen_op3_i64(INDEX_op_eqv_i64, ret, arg1, arg2);
     } else {
         tcg_gen_xor_i64(ret, arg1, arg2);
         tcg_gen_not_i64(ret, ret);
     }
-#else
-    tcg_gen_eqv_i32(TCGV_LOW(ret), TCGV_LOW(arg1), TCGV_LOW(arg2));
-    tcg_gen_eqv_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1), TCGV_HIGH(arg2));
-#endif
 }
 
 void tcg_gen_nand_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
 {
-#if TCG_TARGET_REG_BITS == 64
-    if (TCG_TARGET_HAS_nand_i64) {
+    if (TCG_TARGET_REG_BITS == 32) {
+        tcg_gen_nand_i32(TCGV_LOW(ret), TCGV_LOW(arg1), TCGV_LOW(arg2));
+        tcg_gen_nand_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1), TCGV_HIGH(arg2));
+    } else if (TCG_TARGET_HAS_nand_i64) {
         tcg_gen_op3_i64(INDEX_op_nand_i64, ret, arg1, arg2);
     } else {
         tcg_gen_and_i64(ret, arg1, arg2);
         tcg_gen_not_i64(ret, ret);
     }
-#else
-    tcg_gen_nand_i32(TCGV_LOW(ret), TCGV_LOW(arg1), TCGV_LOW(arg2));
-    tcg_gen_nand_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1), TCGV_HIGH(arg2));
-#endif
 }
 
 void tcg_gen_nor_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
 {
-#if TCG_TARGET_REG_BITS == 64
-    if (TCG_TARGET_HAS_nor_i64) {
+    if (TCG_TARGET_REG_BITS == 32) {
+        tcg_gen_nor_i32(TCGV_LOW(ret), TCGV_LOW(arg1), TCGV_LOW(arg2));
+        tcg_gen_nor_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1), TCGV_HIGH(arg2));
+    } else if (TCG_TARGET_HAS_nor_i64) {
         tcg_gen_op3_i64(INDEX_op_nor_i64, ret, arg1, arg2);
     } else {
         tcg_gen_or_i64(ret, arg1, arg2);
         tcg_gen_not_i64(ret, ret);
     }
-#else
-    tcg_gen_nor_i32(TCGV_LOW(ret), TCGV_LOW(arg1), TCGV_LOW(arg2));
-    tcg_gen_nor_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1), TCGV_HIGH(arg2));
-#endif
 }
 
 void tcg_gen_orc_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
 {
-#if TCG_TARGET_REG_BITS == 64
-    if (TCG_TARGET_HAS_orc_i64) {
+    if (TCG_TARGET_REG_BITS == 32) {
+        tcg_gen_orc_i32(TCGV_LOW(ret), TCGV_LOW(arg1), TCGV_LOW(arg2));
+        tcg_gen_orc_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1), TCGV_HIGH(arg2));
+    } else if (TCG_TARGET_HAS_orc_i64) {
         tcg_gen_op3_i64(INDEX_op_orc_i64, ret, arg1, arg2);
     } else {
         TCGv_i64 t0 = tcg_temp_new_i64();
@@ -1491,10 +1465,6 @@ void tcg_gen_orc_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
         tcg_gen_or_i64(ret, arg1, t0);
         tcg_temp_free_i64(t0);
     }
-#else
-    tcg_gen_orc_i32(TCGV_LOW(ret), TCGV_LOW(arg1), TCGV_LOW(arg2));
-    tcg_gen_orc_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1), TCGV_HIGH(arg2));
-#endif
 }
 
 void tcg_gen_rotl_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2)
@@ -1583,20 +1553,20 @@ void tcg_gen_deposit_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2,
         return;
     }
 
-#if TCG_TARGET_REG_BITS == 32
-    if (ofs >= 32) {
-        tcg_gen_deposit_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1),
-                            TCGV_LOW(arg2), ofs - 32, len);
-        tcg_gen_mov_i32(TCGV_LOW(ret), TCGV_LOW(arg1));
-        return;
-    }
-    if (ofs + len <= 32) {
-        tcg_gen_deposit_i32(TCGV_LOW(ret), TCGV_LOW(arg1),
-                            TCGV_LOW(arg2), ofs, len);
-        tcg_gen_mov_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1));
-        return;
+    if (TCG_TARGET_REG_BITS == 32) {
+        if (ofs >= 32) {
+            tcg_gen_deposit_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1),
+                                TCGV_LOW(arg2), ofs - 32, len);
+            tcg_gen_mov_i32(TCGV_LOW(ret), TCGV_LOW(arg1));
+            return;
+        }
+        if (ofs + len <= 32) {
+            tcg_gen_deposit_i32(TCGV_LOW(ret), TCGV_LOW(arg1),
+                                TCGV_LOW(arg2), ofs, len);
+            tcg_gen_mov_i32(TCGV_HIGH(ret), TCGV_HIGH(arg1));
+            return;
+        }
     }
-#endif
 
     mask = (1ull << len) - 1;
     t1 = tcg_temp_new_i64();
@@ -1616,34 +1586,33 @@ void tcg_gen_deposit_i64(TCGv_i64 ret, TCGv_i64 arg1, TCGv_i64 arg2,
 void tcg_gen_movcond_i64(TCGCond cond, TCGv_i64 ret, TCGv_i64 c1,
                          TCGv_i64 c2, TCGv_i64 v1, TCGv_i64 v2)
 {
-#if TCG_TARGET_REG_BITS == 32
-    TCGv_i32 t0 = tcg_temp_new_i32();
-    TCGv_i32 t1 = tcg_temp_new_i32();
-    tcg_gen_op6i_i32(INDEX_op_setcond2_i32, t0,
-                     TCGV_LOW(c1), TCGV_HIGH(c1),
-                     TCGV_LOW(c2), TCGV_HIGH(c2), cond);
-
-    if (TCG_TARGET_HAS_movcond_i32) {
-        tcg_gen_movi_i32(t1, 0);
-        tcg_gen_movcond_i32(TCG_COND_NE, TCGV_LOW(ret), t0, t1,
-                            TCGV_LOW(v1), TCGV_LOW(v2));
-        tcg_gen_movcond_i32(TCG_COND_NE, TCGV_HIGH(ret), t0, t1,
-                            TCGV_HIGH(v1), TCGV_HIGH(v2));
-    } else {
-        tcg_gen_neg_i32(t0, t0);
+    if (TCG_TARGET_REG_BITS == 32) {
+        TCGv_i32 t0 = tcg_temp_new_i32();
+        TCGv_i32 t1 = tcg_temp_new_i32();
+        tcg_gen_op6i_i32(INDEX_op_setcond2_i32, t0,
+                         TCGV_LOW(c1), TCGV_HIGH(c1),
+                         TCGV_LOW(c2), TCGV_HIGH(c2), cond);
+
+        if (TCG_TARGET_HAS_movcond_i32) {
+            tcg_gen_movi_i32(t1, 0);
+            tcg_gen_movcond_i32(TCG_COND_NE, TCGV_LOW(ret), t0, t1,
+                                TCGV_LOW(v1), TCGV_LOW(v2));
+            tcg_gen_movcond_i32(TCG_COND_NE, TCGV_HIGH(ret), t0, t1,
+                                TCGV_HIGH(v1), TCGV_HIGH(v2));
+        } else {
+            tcg_gen_neg_i32(t0, t0);
 
-        tcg_gen_and_i32(t1, TCGV_LOW(v1), t0);
-        tcg_gen_andc_i32(TCGV_LOW(ret), TCGV_LOW(v2), t0);
-        tcg_gen_or_i32(TCGV_LOW(ret), TCGV_LOW(ret), t1);
+            tcg_gen_and_i32(t1, TCGV_LOW(v1), t0);
+            tcg_gen_andc_i32(TCGV_LOW(ret), TCGV_LOW(v2), t0);
+            tcg_gen_or_i32(TCGV_LOW(ret), TCGV_LOW(ret), t1);
 
-        tcg_gen_and_i32(t1, TCGV_HIGH(v1), t0);
-        tcg_gen_andc_i32(TCGV_HIGH(ret), TCGV_HIGH(v2), t0);
-        tcg_gen_or_i32(TCGV_HIGH(ret), TCGV_HIGH(ret), t1);
-    }
-    tcg_temp_free_i32(t0);
-    tcg_temp_free_i32(t1);
-#else
-    if (TCG_TARGET_HAS_movcond_i64) {
+            tcg_gen_and_i32(t1, TCGV_HIGH(v1), t0);
+            tcg_gen_andc_i32(TCGV_HIGH(ret), TCGV_HIGH(v2), t0);
+            tcg_gen_or_i32(TCGV_HIGH(ret), TCGV_HIGH(ret), t1);
+        }
+        tcg_temp_free_i32(t0);
+        tcg_temp_free_i32(t1);
+    } else if (TCG_TARGET_HAS_movcond_i64) {
         tcg_gen_op6i_i64(INDEX_op_movcond_i64, ret, c1, c2, v1, v2, cond);
     } else {
         TCGv_i64 t0 = tcg_temp_new_i64();
@@ -1656,7 +1625,6 @@ void tcg_gen_movcond_i64(TCGCond cond, TCGv_i64 ret, TCGv_i64 c1,
         tcg_temp_free_i64(t0);
         tcg_temp_free_i64(t1);
     }
-#endif
 }
 
 void tcg_gen_add2_i64(TCGv_i64 rl, TCGv_i64 rh, TCGv_i64 al,
@@ -1764,19 +1732,18 @@ void tcg_gen_muls2_i64(TCGv_i64 rl, TCGv_i64 rh, TCGv_i64 arg1, TCGv_i64 arg2)
 void tcg_gen_trunc_shr_i64_i32(TCGv_i32 ret, TCGv_i64 arg, unsigned count)
 {
     tcg_debug_assert(count < 64);
-#if TCG_TARGET_REG_BITS == 32
-    if (count >= 32) {
-        tcg_gen_shri_i32(ret, TCGV_HIGH(arg), count - 32);
-    } else if (count == 0) {
-        tcg_gen_mov_i32(ret, TCGV_LOW(arg));
-    } else {
-        TCGv_i64 t = tcg_temp_new_i64();
-        tcg_gen_shri_i64(t, arg, count);
-        tcg_gen_mov_i32(ret, TCGV_LOW(t));
-        tcg_temp_free_i64(t);
-    }
-#else
-    if (TCG_TARGET_HAS_trunc_shr_i32) {
+    if (TCG_TARGET_REG_BITS == 32) {
+        if (count >= 32) {
+            tcg_gen_shri_i32(ret, TCGV_HIGH(arg), count - 32);
+        } else if (count == 0) {
+            tcg_gen_mov_i32(ret, TCGV_LOW(arg));
+        } else {
+            TCGv_i64 t = tcg_temp_new_i64();
+            tcg_gen_shri_i64(t, arg, count);
+            tcg_gen_mov_i32(ret, TCGV_LOW(t));
+            tcg_temp_free_i64(t);
+        }
+    } else if (TCG_TARGET_HAS_trunc_shr_i32) {
         tcg_gen_op3i_i32(INDEX_op_trunc_shr_i32, ret,
                          MAKE_TCGV_I32(GET_TCGV_I64(arg)), count);
     } else if (count == 0) {
@@ -1787,40 +1754,43 @@ void tcg_gen_trunc_shr_i64_i32(TCGv_i32 ret, TCGv_i64 arg, unsigned count)
         tcg_gen_mov_i32(ret, MAKE_TCGV_I32(GET_TCGV_I64(t)));
         tcg_temp_free_i64(t);
     }
-#endif
 }
 
 void tcg_gen_extu_i32_i64(TCGv_i64 ret, TCGv_i32 arg)
 {
-#if TCG_TARGET_REG_BITS == 32
-    tcg_gen_mov_i32(TCGV_LOW(ret), arg);
-    tcg_gen_movi_i32(TCGV_HIGH(ret), 0);
-#else
-    /* Note: we assume the target supports move between
-       32 and 64 bit registers.  */
-    tcg_gen_ext32u_i64(ret, MAKE_TCGV_I64(GET_TCGV_I32(arg)));
-#endif
+    if (TCG_TARGET_REG_BITS == 32) {
+        tcg_gen_mov_i32(TCGV_LOW(ret), arg);
+        tcg_gen_movi_i32(TCGV_HIGH(ret), 0);
+    } else {
+        /* Note: we assume the target supports move between
+           32 and 64 bit registers.  */
+        tcg_gen_ext32u_i64(ret, MAKE_TCGV_I64(GET_TCGV_I32(arg)));
+    }
 }
 
 void tcg_gen_ext_i32_i64(TCGv_i64 ret, TCGv_i32 arg)
 {
-#if TCG_TARGET_REG_BITS == 32
-    tcg_gen_mov_i32(TCGV_LOW(ret), arg);
-    tcg_gen_sari_i32(TCGV_HIGH(ret), TCGV_LOW(ret), 31);
-#else
-    /* Note: we assume the target supports move between
-       32 and 64 bit registers.  */
-    tcg_gen_ext32s_i64(ret, MAKE_TCGV_I64(GET_TCGV_I32(arg)));
-#endif
+    if (TCG_TARGET_REG_BITS == 32) {
+        tcg_gen_mov_i32(TCGV_LOW(ret), arg);
+        tcg_gen_sari_i32(TCGV_HIGH(ret), TCGV_LOW(ret), 31);
+    } else {
+        /* Note: we assume the target supports move between
+           32 and 64 bit registers.  */
+        tcg_gen_ext32s_i64(ret, MAKE_TCGV_I64(GET_TCGV_I32(arg)));
+    }
 }
 
 void tcg_gen_concat_i32_i64(TCGv_i64 dest, TCGv_i32 low, TCGv_i32 high)
 {
-#if TCG_TARGET_REG_BITS == 32
-    tcg_gen_mov_i32(TCGV_LOW(dest), low);
-    tcg_gen_mov_i32(TCGV_HIGH(dest), high);
-#else
-    TCGv_i64 tmp = tcg_temp_new_i64();
+    TCGv_i64 tmp;
+
+    if (TCG_TARGET_REG_BITS == 32) {
+        tcg_gen_mov_i32(TCGV_LOW(dest), low);
+        tcg_gen_mov_i32(TCGV_HIGH(dest), high);
+        return;
+    }
+
+    tmp = tcg_temp_new_i64();
     /* These extensions are only needed for type correctness.
        We may be able to do better given target specific information.  */
     tcg_gen_extu_i32_i64(tmp, high);
@@ -1834,18 +1804,17 @@ void tcg_gen_concat_i32_i64(TCGv_i64 dest, TCGv_i32 low, TCGv_i32 high)
         tcg_gen_or_i64(dest, dest, tmp);
     }
     tcg_temp_free_i64(tmp);
-#endif
 }
 
 void tcg_gen_extr_i64_i32(TCGv_i32 lo, TCGv_i32 hi, TCGv_i64 arg)
 {
-#if TCG_TARGET_REG_BITS == 32
-    tcg_gen_mov_i32(lo, TCGV_LOW(arg));
-    tcg_gen_mov_i32(hi, TCGV_HIGH(arg));
-#else
-    tcg_gen_trunc_shr_i64_i32(lo, arg, 0);
-    tcg_gen_trunc_shr_i64_i32(hi, arg, 32);
-#endif
+    if (TCG_TARGET_REG_BITS == 32) {
+        tcg_gen_mov_i32(lo, TCGV_LOW(arg));
+        tcg_gen_mov_i32(hi, TCGV_HIGH(arg));
+    } else {
+        tcg_gen_trunc_shr_i64_i32(lo, arg, 0);
+        tcg_gen_trunc_shr_i64_i32(hi, arg, 32);
+    }
 }
 
 void tcg_gen_extr32_i64(TCGv_i64 lo, TCGv_i64 hi, TCGv_i64 arg)
@@ -1900,12 +1869,12 @@ static inline void tcg_add_param_i32(TCGv_i32 val)
 
 static inline void tcg_add_param_i64(TCGv_i64 val)
 {
-#if TCG_TARGET_REG_BITS == 32
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(TCGV_LOW(val));
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(TCGV_HIGH(val));
-#else
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(val);
-#endif
+    if (TCG_TARGET_REG_BITS == 32) {
+        *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(TCGV_LOW(val));
+        *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(TCGV_HIGH(val));
+    } else {
+        *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(val);
+    }
 }
 
 #if TARGET_LONG_BITS == 32
@@ -1940,8 +1909,7 @@ void tcg_gen_qemu_ld_i64(TCGv_i64 val, TCGv addr, TCGArg idx, TCGMemOp memop)
 {
     memop = tcg_canonicalize_memop(memop, 1, 0);
 
-#if TCG_TARGET_REG_BITS == 32
-    if ((memop & MO_SIZE) < MO_64) {
+    if (TCG_TARGET_REG_BITS == 32 && (memop & MO_SIZE) < MO_64) {
         tcg_gen_qemu_ld_i32(TCGV_LOW(val), addr, idx, memop);
         if (memop & MO_SIGN) {
             tcg_gen_sari_i32(TCGV_HIGH(val), TCGV_LOW(val), 31);
@@ -1950,7 +1918,6 @@ void tcg_gen_qemu_ld_i64(TCGv_i64 val, TCGv addr, TCGArg idx, TCGMemOp memop)
         }
         return;
     }
-#endif
 
     *tcg_ctx.gen_opc_ptr++ = INDEX_op_qemu_ld_i64;
     tcg_add_param_i64(val);
@@ -1963,12 +1930,10 @@ void tcg_gen_qemu_st_i64(TCGv_i64 val, TCGv addr, TCGArg idx, TCGMemOp memop)
 {
     memop = tcg_canonicalize_memop(memop, 1, 1);
 
-#if TCG_TARGET_REG_BITS == 32
-    if ((memop & MO_SIZE) < MO_64) {
+    if (TCG_TARGET_REG_BITS == 32 && (memop & MO_SIZE) < MO_64) {
         tcg_gen_qemu_st_i32(TCGV_LOW(val), addr, idx, memop);
         return;
     }
-#endif
 
     *tcg_ctx.gen_opc_ptr++ = INDEX_op_qemu_st_i64;
     tcg_add_param_i64(val);
-- 
1.9.3

* [Qemu-devel] [PATCH 2.3 3/8] tcg: Move emit of INDEX_op_end into gen_tb_end
From: Richard Henderson @ 2014-11-11 16:24 UTC (permalink / raw)
  To: qemu-devel; +Cc: aurelien

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 include/exec/gen-icount.h     | 2 ++
 target-alpha/translate.c      | 2 +-
 target-arm/translate-a64.c    | 1 -
 target-arm/translate.c        | 1 -
 target-cris/translate.c       | 2 +-
 target-i386/translate.c       | 2 +-
 target-lm32/translate.c       | 2 +-
 target-m68k/translate.c       | 1 -
 target-microblaze/translate.c | 2 +-
 target-mips/translate.c       | 2 +-
 target-moxie/translate.c      | 2 +-
 target-openrisc/translate.c   | 2 +-
 target-ppc/translate.c        | 2 +-
 target-s390x/translate.c      | 2 +-
 target-sh4/translate.c        | 2 +-
 target-sparc/translate.c      | 2 +-
 target-tricore/translate.c    | 1 -
 target-unicore32/translate.c  | 1 -
 target-xtensa/translate.c     | 1 -
 19 files changed, 14 insertions(+), 18 deletions(-)

diff --git a/include/exec/gen-icount.h b/include/exec/gen-icount.h
index da53395..d5266ff 100644
--- a/include/exec/gen-icount.h
+++ b/include/exec/gen-icount.h
@@ -48,6 +48,8 @@ static void gen_tb_end(TranslationBlock *tb, int num_insns)
         gen_set_label(icount_label);
         tcg_gen_exit_tb((uintptr_t)tb + TB_EXIT_ICOUNT_EXPIRED);
     }
+
+    *tcg_ctx.gen_opc_ptr = INDEX_op_end;
 }
 
 static inline void gen_io_start(void)
diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index 76658a0..bb85c2d 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -2912,7 +2912,7 @@ static inline void gen_intermediate_code_internal(AlphaCPU *cpu,
     }
 
     gen_tb_end(tb, num_insns);
-    *tcg_ctx.gen_opc_ptr = INDEX_op_end;
+
     if (search_pc) {
         j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
         lj++;
diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index 80d2c07..220ebba 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -11090,7 +11090,6 @@ void gen_intermediate_code_internal_a64(ARMCPU *cpu,
 
 done_generating:
     gen_tb_end(tb, num_insns);
-    *tcg_ctx.gen_opc_ptr = INDEX_op_end;
 
 #ifdef DEBUG_DISAS
     if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) {
diff --git a/target-arm/translate.c b/target-arm/translate.c
index af51568..c31c3dc 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -11325,7 +11325,6 @@ static inline void gen_intermediate_code_internal(ARMCPU *cpu,
 
 done_generating:
     gen_tb_end(tb, num_insns);
-    *tcg_ctx.gen_opc_ptr = INDEX_op_end;
 
 #ifdef DEBUG_DISAS
     if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) {
diff --git a/target-cris/translate.c b/target-cris/translate.c
index e37b04e..e9acaeb 100644
--- a/target-cris/translate.c
+++ b/target-cris/translate.c
@@ -3348,7 +3348,7 @@ gen_intermediate_code_internal(CRISCPU *cpu, TranslationBlock *tb,
         }
     }
     gen_tb_end(tb, num_insns);
-    *tcg_ctx.gen_opc_ptr = INDEX_op_end;
+
     if (search_pc) {
         j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
         lj++;
diff --git a/target-i386/translate.c b/target-i386/translate.c
index 782f7d2..1a2e610 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -8040,7 +8040,7 @@ static inline void gen_intermediate_code_internal(X86CPU *cpu,
         gen_io_end();
 done_generating:
     gen_tb_end(tb, num_insns);
-    *tcg_ctx.gen_opc_ptr = INDEX_op_end;
+
     /* we don't forget to fill the last values */
     if (search_pc) {
         j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
diff --git a/target-lm32/translate.c b/target-lm32/translate.c
index 8454e8b..482b8dd 100644
--- a/target-lm32/translate.c
+++ b/target-lm32/translate.c
@@ -1158,7 +1158,7 @@ void gen_intermediate_code_internal(LM32CPU *cpu,
     }
 
     gen_tb_end(tb, num_insns);
-    *tcg_ctx.gen_opc_ptr = INDEX_op_end;
+
     if (search_pc) {
         j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
         lj++;
diff --git a/target-m68k/translate.c b/target-m68k/translate.c
index efd4cfc..2c396ef 100644
--- a/target-m68k/translate.c
+++ b/target-m68k/translate.c
@@ -3075,7 +3075,6 @@ gen_intermediate_code_internal(M68kCPU *cpu, TranslationBlock *tb,
         }
     }
     gen_tb_end(tb, num_insns);
-    *tcg_ctx.gen_opc_ptr = INDEX_op_end;
 
 #ifdef DEBUG_DISAS
     if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) {
diff --git a/target-microblaze/translate.c b/target-microblaze/translate.c
index fd2b771..0e3d612 100644
--- a/target-microblaze/translate.c
+++ b/target-microblaze/translate.c
@@ -1846,7 +1846,7 @@ gen_intermediate_code_internal(MicroBlazeCPU *cpu, TranslationBlock *tb,
         }
     }
     gen_tb_end(tb, num_insns);
-    *tcg_ctx.gen_opc_ptr = INDEX_op_end;
+
     if (search_pc) {
         j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
         lj++;
diff --git a/target-mips/translate.c b/target-mips/translate.c
index f0b8e6f..ff51bc9 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -19133,7 +19133,7 @@ gen_intermediate_code_internal(MIPSCPU *cpu, TranslationBlock *tb,
     }
 done_generating:
     gen_tb_end(tb, num_insns);
-    *tcg_ctx.gen_opc_ptr = INDEX_op_end;
+
     if (search_pc) {
         j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
         lj++;
diff --git a/target-moxie/translate.c b/target-moxie/translate.c
index 4541b9b..675c4d0 100644
--- a/target-moxie/translate.c
+++ b/target-moxie/translate.c
@@ -900,7 +900,7 @@ gen_intermediate_code_internal(MoxieCPU *cpu, TranslationBlock *tb,
     }
  done_generating:
     gen_tb_end(tb, num_insns);
-    *tcg_ctx.gen_opc_ptr = INDEX_op_end;
+
     if (search_pc) {
         j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
         lj++;
diff --git a/target-openrisc/translate.c b/target-openrisc/translate.c
index 407bd97..9dcc9ae 100644
--- a/target-openrisc/translate.c
+++ b/target-openrisc/translate.c
@@ -1759,7 +1759,7 @@ static inline void gen_intermediate_code_internal(OpenRISCCPU *cpu,
     }
 
     gen_tb_end(tb, num_insns);
-    *tcg_ctx.gen_opc_ptr = INDEX_op_end;
+
     if (search_pc) {
         j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
         k++;
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 910ce56..9ab81a9 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -11449,7 +11449,7 @@ static inline void gen_intermediate_code_internal(PowerPCCPU *cpu,
         tcg_gen_exit_tb(0);
     }
     gen_tb_end(tb, num_insns);
-    *tcg_ctx.gen_opc_ptr = INDEX_op_end;
+
     if (unlikely(search_pc)) {
         j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
         lj++;
diff --git a/target-s390x/translate.c b/target-s390x/translate.c
index dbf1993..661d110 100644
--- a/target-s390x/translate.c
+++ b/target-s390x/translate.c
@@ -4856,7 +4856,7 @@ static inline void gen_intermediate_code_internal(S390CPU *cpu,
     }
 
     gen_tb_end(tb, num_insns);
-    *tcg_ctx.gen_opc_ptr = INDEX_op_end;
+
     if (search_pc) {
         j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
         lj++;
diff --git a/target-sh4/translate.c b/target-sh4/translate.c
index 3088edc..0550611 100644
--- a/target-sh4/translate.c
+++ b/target-sh4/translate.c
@@ -1962,7 +1962,7 @@ gen_intermediate_code_internal(SuperHCPU *cpu, TranslationBlock *tb,
     }
 
     gen_tb_end(tb, num_insns);
-    *tcg_ctx.gen_opc_ptr = INDEX_op_end;
+
     if (search_pc) {
         i = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
         ii++;
diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index 78c4e21..fc239bd 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -5342,7 +5342,7 @@ static inline void gen_intermediate_code_internal(SPARCCPU *cpu,
         }
     }
     gen_tb_end(tb, num_insns);
-    *tcg_ctx.gen_opc_ptr = INDEX_op_end;
+
     if (spc) {
         j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
         lj++;
diff --git a/target-tricore/translate.c b/target-tricore/translate.c
index d5a9596..48b1a5e 100644
--- a/target-tricore/translate.c
+++ b/target-tricore/translate.c
@@ -2468,7 +2468,6 @@ gen_intermediate_code_internal(TriCoreCPU *cpu, struct TranslationBlock *tb,
     }
 
     gen_tb_end(tb, num_insns);
-    *tcg_ctx.gen_opc_ptr = INDEX_op_end;
     if (search_pc) {
         printf("done_generating search pc\n");
     } else {
diff --git a/target-unicore32/translate.c b/target-unicore32/translate.c
index 653c225..04ee3c2 100644
--- a/target-unicore32/translate.c
+++ b/target-unicore32/translate.c
@@ -2037,7 +2037,6 @@ static inline void gen_intermediate_code_internal(UniCore32CPU *cpu,
 
 done_generating:
     gen_tb_end(tb, num_insns);
-    *tcg_ctx.gen_opc_ptr = INDEX_op_end;
 
 #ifdef DEBUG_DISAS
     if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) {
diff --git a/target-xtensa/translate.c b/target-xtensa/translate.c
index badca19..8e135fc 100644
--- a/target-xtensa/translate.c
+++ b/target-xtensa/translate.c
@@ -3097,7 +3097,6 @@ void gen_intermediate_code_internal(XtensaCPU *cpu,
         gen_jumpi(&dc, dc.pc, 0);
     }
     gen_tb_end(tb, insn_count);
-    *tcg_ctx.gen_opc_ptr = INDEX_op_end;
 
 #ifdef DEBUG_DISAS
     if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) {
-- 
1.9.3

* [Qemu-devel] [PATCH 2.3 4/8] tcg: Introduce tcg_op_buf_count and tcg_op_buf_full
From: Richard Henderson @ 2014-11-11 16:24 UTC (permalink / raw)
  To: qemu-devel; +Cc: aurelien

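The method by which we count the number of ops emitted is going to
change.  Abstract the count and the buffer-full test away into two
inline functions, tcg_op_buf_count and tcg_op_buf_full, so that the
targets no longer do pointer arithmetic on gen_opc_ptr themselves.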
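
For readers skimming the per-target hunks, here is a minimal standalone
sketch of what such helpers could look like.  It is not the tcg/tcg.h
hunk of this patch.  It only mirrors the two expressions the diff
replaces in each target (gen_opc_ptr - gen_opc_buf, and gen_opc_ptr >=
gen_opc_buf + OPC_MAX_SIZE).  The sketch_ names, the context struct and
the tiny buffer size are all hypothetical stand-ins.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for OPC_MAX_SIZE; the real value differs. */
#define SKETCH_OPC_MAX_SIZE 8

/* Hypothetical stand-in for the opcode-buffer fields of tcg_ctx. */
struct sketch_ctx {
    uint16_t gen_opc_buf[SKETCH_OPC_MAX_SIZE];
    uint16_t *gen_opc_ptr;      /* next free slot in gen_opc_buf */
};

/* Start a new translation block with an empty opcode buffer. */
static inline void sketch_op_buf_reset(struct sketch_ctx *s)
{
    s->gen_opc_ptr = s->gen_opc_buf;
}

/* Number of opcodes emitted so far (was: gen_opc_ptr - gen_opc_buf). */
static inline int sketch_op_buf_count(const struct sketch_ctx *s)
{
    return (int)(s->gen_opc_ptr - s->gen_opc_buf);
}

/* Whether translation should stop (was: gen_opc_ptr >= gen_opc_end). */
static inline bool sketch_op_buf_full(const struct sketch_ctx *s)
{
    return sketch_op_buf_count(s) >= SKETCH_OPC_MAX_SIZE;
}

/* Append one opcode; callers are expected to check for full first. */
static inline void sketch_emit(struct sketch_ctx *s, uint16_t opc)
{
    *s->gen_opc_ptr++ = opc;
}
```

With the count hidden behind a function, a later change of the
underlying storage only has to touch the helper bodies, not every
translate.c.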

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 target-alpha/translate.c      | 14 +++++++-------
 target-arm/translate-a64.c    |  9 +++------
 target-arm/translate.c        |  9 +++------
 target-cris/translate.c       | 13 +++++--------
 target-i386/translate.c       |  9 +++------
 target-lm32/translate.c       | 14 +++++---------
 target-m68k/translate.c       |  9 +++------
 target-microblaze/translate.c | 20 ++++++++------------
 target-mips/translate.c       |  8 +++-----
 target-moxie/translate.c      |  8 +++-----
 target-openrisc/translate.c   | 13 +++++--------
 target-ppc/translate.c        |  9 +++------
 target-s390x/translate.c      |  9 +++------
 target-sh4/translate.c        |  8 +++-----
 target-sparc/translate.c      |  8 +++-----
 target-tricore/translate.c    |  4 +---
 target-unicore32/translate.c  |  9 +++------
 target-xtensa/translate.c     |  7 +++----
 tcg/tcg.h                     | 12 ++++++++++++
 19 files changed, 79 insertions(+), 113 deletions(-)

diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index bb85c2d..5bec586 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -2790,7 +2790,6 @@ static inline void gen_intermediate_code_internal(AlphaCPU *cpu,
     target_ulong pc_start;
     target_ulong pc_mask;
     uint32_t insn;
-    uint16_t *gen_opc_end;
     CPUBreakpoint *bp;
     int j, lj = -1;
     ExitStatus ret;
@@ -2798,7 +2797,6 @@ static inline void gen_intermediate_code_internal(AlphaCPU *cpu,
     int max_insns;
 
     pc_start = tb->pc;
-    gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE;
 
     ctx.tb = tb;
     ctx.pc = pc_start;
@@ -2839,11 +2837,12 @@ static inline void gen_intermediate_code_internal(AlphaCPU *cpu,
             }
         }
         if (search_pc) {
-            j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
+            j = tcg_op_buf_count();
             if (lj < j) {
                 lj++;
-                while (lj < j)
+                while (lj < j) {
                     tcg_ctx.gen_opc_instr_start[lj++] = 0;
+                }
             }
             tcg_ctx.gen_opc_pc[lj] = ctx.pc;
             tcg_ctx.gen_opc_instr_start[lj] = 1;
@@ -2881,7 +2880,7 @@ static inline void gen_intermediate_code_internal(AlphaCPU *cpu,
            or exhaust instruction count, stop generation.  */
         if (ret == NO_EXIT
             && ((ctx.pc & pc_mask) == 0
-                || tcg_ctx.gen_opc_ptr >= gen_opc_end
+                || tcg_op_buf_full()
                 || num_insns >= max_insns
                 || singlestep
                 || ctx.singlestep_enabled)) {
@@ -2914,10 +2913,11 @@ static inline void gen_intermediate_code_internal(AlphaCPU *cpu,
     gen_tb_end(tb, num_insns);
 
     if (search_pc) {
-        j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
+        j = tcg_op_buf_count();
         lj++;
-        while (lj <= j)
+        while (lj <= j) {
             tcg_ctx.gen_opc_instr_start[lj++] = 0;
+        }
     } else {
         tb->size = ctx.pc - pc_start;
         tb->icount = num_insns;
diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index 220ebba..1e4e4ef 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -10899,7 +10899,6 @@ void gen_intermediate_code_internal_a64(ARMCPU *cpu,
     CPUARMState *env = &cpu->env;
     DisasContext dc1, *dc = &dc1;
     CPUBreakpoint *bp;
-    uint16_t *gen_opc_end;
     int j, lj;
     target_ulong pc_start;
     target_ulong next_page_start;
@@ -10910,8 +10909,6 @@ void gen_intermediate_code_internal_a64(ARMCPU *cpu,
 
     dc->tb = tb;
 
-    gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE;
-
     dc->is_jmp = DISAS_NEXT;
     dc->pc = pc_start;
     dc->singlestep_enabled = cs->singlestep_enabled;
@@ -10980,7 +10977,7 @@ void gen_intermediate_code_internal_a64(ARMCPU *cpu,
         }
 
         if (search_pc) {
-            j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
+            j = tcg_op_buf_count();
             if (lj < j) {
                 lj++;
                 while (lj < j) {
@@ -11030,7 +11027,7 @@ void gen_intermediate_code_internal_a64(ARMCPU *cpu,
          * ensures prefetch aborts occur at the right place.
          */
         num_insns++;
-    } while (!dc->is_jmp && tcg_ctx.gen_opc_ptr < gen_opc_end &&
+    } while (!dc->is_jmp && !tcg_op_buf_full() &&
              !cs->singlestep_enabled &&
              !singlestep &&
              !dc->ss_active &&
@@ -11101,7 +11098,7 @@ done_generating:
     }
 #endif
     if (search_pc) {
-        j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
+        j = tcg_op_buf_count();
         lj++;
         while (lj <= j) {
             tcg_ctx.gen_opc_instr_start[lj++] = 0;
diff --git a/target-arm/translate.c b/target-arm/translate.c
index c31c3dc..32d8505 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -10995,7 +10995,6 @@ static inline void gen_intermediate_code_internal(ARMCPU *cpu,
     CPUARMState *env = &cpu->env;
     DisasContext dc1, *dc = &dc1;
     CPUBreakpoint *bp;
-    uint16_t *gen_opc_end;
     int j, lj;
     target_ulong pc_start;
     target_ulong next_page_start;
@@ -11016,8 +11015,6 @@ static inline void gen_intermediate_code_internal(ARMCPU *cpu,
 
     dc->tb = tb;
 
-    gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE;
-
     dc->is_jmp = DISAS_NEXT;
     dc->pc = pc_start;
     dc->singlestep_enabled = cs->singlestep_enabled;
@@ -11150,7 +11147,7 @@ static inline void gen_intermediate_code_internal(ARMCPU *cpu,
             }
         }
         if (search_pc) {
-            j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
+            j = tcg_op_buf_count();
             if (lj < j) {
                 lj++;
                 while (lj < j)
@@ -11216,7 +11213,7 @@ static inline void gen_intermediate_code_internal(ARMCPU *cpu,
          * Also stop translation when a page boundary is reached.  This
          * ensures prefetch aborts occur at the right place.  */
         num_insns ++;
-    } while (!dc->is_jmp && tcg_ctx.gen_opc_ptr < gen_opc_end &&
+    } while (!dc->is_jmp && !tcg_op_buf_full() &&
              !cs->singlestep_enabled &&
              !singlestep &&
              !dc->ss_active &&
@@ -11336,7 +11333,7 @@ done_generating:
     }
 #endif
     if (search_pc) {
-        j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
+        j = tcg_op_buf_count();
         lj++;
         while (lj <= j)
             tcg_ctx.gen_opc_instr_start[lj++] = 0;
diff --git a/target-cris/translate.c b/target-cris/translate.c
index e9acaeb..1f39e73 100644
--- a/target-cris/translate.c
+++ b/target-cris/translate.c
@@ -3120,7 +3120,6 @@ gen_intermediate_code_internal(CRISCPU *cpu, TranslationBlock *tb,
 {
     CPUState *cs = CPU(cpu);
     CPUCRISState *env = &cpu->env;
-    uint16_t *gen_opc_end;
     uint32_t pc_start;
     unsigned int insn_len;
     int j, lj;
@@ -3146,8 +3145,6 @@ gen_intermediate_code_internal(CRISCPU *cpu, TranslationBlock *tb,
     dc->cpu = cpu;
     dc->tb = tb;
 
-    gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE;
-
     dc->is_jmp = DISAS_NEXT;
     dc->ppc = pc_start;
     dc->pc = pc_start;
@@ -3211,7 +3208,7 @@ gen_intermediate_code_internal(CRISCPU *cpu, TranslationBlock *tb,
         check_breakpoint(env, dc);
 
         if (search_pc) {
-            j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
+            j = tcg_op_buf_count();
             if (lj < j) {
                 lj++;
                 while (lj < j) {
@@ -3295,7 +3292,7 @@ gen_intermediate_code_internal(CRISCPU *cpu, TranslationBlock *tb,
             break;
         }
     } while (!dc->is_jmp && !dc->cpustate_changed
-            && tcg_ctx.gen_opc_ptr < gen_opc_end
+            && !tcg_op_buf_full()
             && !singlestep
             && (dc->pc < next_page_start)
             && num_insns < max_insns);
@@ -3350,7 +3347,7 @@ gen_intermediate_code_internal(CRISCPU *cpu, TranslationBlock *tb,
     gen_tb_end(tb, num_insns);
 
     if (search_pc) {
-        j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
+        j = tcg_op_buf_count();
         lj++;
         while (lj <= j) {
             tcg_ctx.gen_opc_instr_start[lj++] = 0;
@@ -3365,8 +3362,8 @@ gen_intermediate_code_internal(CRISCPU *cpu, TranslationBlock *tb,
     if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) {
         log_target_disas(env, pc_start, dc->pc - pc_start,
                          env->pregs[PR_VR]);
-        qemu_log("\nisize=%d osize=%td\n",
-            dc->pc - pc_start, tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf);
+        qemu_log("\nisize=%d osize=%d\n",
+                 dc->pc - pc_start, tcg_op_buf_count());
     }
 #endif
 #endif
diff --git a/target-i386/translate.c b/target-i386/translate.c
index 1a2e610..6977a2f 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -7901,7 +7901,6 @@ static inline void gen_intermediate_code_internal(X86CPU *cpu,
     CPUX86State *env = &cpu->env;
     DisasContext dc1, *dc = &dc1;
     target_ulong pc_ptr;
-    uint16_t *gen_opc_end;
     CPUBreakpoint *bp;
     int j, lj;
     uint64_t flags;
@@ -7970,8 +7969,6 @@ static inline void gen_intermediate_code_internal(X86CPU *cpu,
     cpu_ptr1 = tcg_temp_new_ptr();
     cpu_cc_srcT = tcg_temp_local_new();
 
-    gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE;
-
     dc->is_jmp = DISAS_NEXT;
     pc_ptr = pc_start;
     lj = -1;
@@ -7992,7 +7989,7 @@ static inline void gen_intermediate_code_internal(X86CPU *cpu,
             }
         }
         if (search_pc) {
-            j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
+            j = tcg_op_buf_count();
             if (lj < j) {
                 lj++;
                 while (lj < j)
@@ -8023,7 +8020,7 @@ static inline void gen_intermediate_code_internal(X86CPU *cpu,
             break;
         }
         /* if too long translation, stop generation too */
-        if (tcg_ctx.gen_opc_ptr >= gen_opc_end ||
+        if (tcg_op_buf_full() ||
             (pc_ptr - pc_start) >= (TARGET_PAGE_SIZE - 32) ||
             num_insns >= max_insns) {
             gen_jmp_im(pc_ptr - dc->cs_base);
@@ -8043,7 +8040,7 @@ done_generating:
 
     /* we don't forget to fill the last values */
     if (search_pc) {
-        j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
+        j = tcg_op_buf_count();
         lj++;
         while (lj <= j)
             tcg_ctx.gen_opc_instr_start[lj++] = 0;
diff --git a/target-lm32/translate.c b/target-lm32/translate.c
index 482b8dd..79b67d1 100644
--- a/target-lm32/translate.c
+++ b/target-lm32/translate.c
@@ -1062,7 +1062,6 @@ void gen_intermediate_code_internal(LM32CPU *cpu,
     CPUState *cs = CPU(cpu);
     CPULM32State *env = &cpu->env;
     struct DisasContext ctx, *dc = &ctx;
-    uint16_t *gen_opc_end;
     uint32_t pc_start;
     int j, lj;
     uint32_t next_page_start;
@@ -1075,8 +1074,6 @@ void gen_intermediate_code_internal(LM32CPU *cpu,
     dc->num_watchpoints = cpu->num_watchpoints;
     dc->tb = tb;
 
-    gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE;
-
     dc->is_jmp = DISAS_NEXT;
     dc->pc = pc_start;
     dc->singlestep_enabled = cs->singlestep_enabled;
@@ -1100,7 +1097,7 @@ void gen_intermediate_code_internal(LM32CPU *cpu,
         check_breakpoint(env, dc);
 
         if (search_pc) {
-            j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
+            j = tcg_op_buf_count();
             if (lj < j) {
                 lj++;
                 while (lj < j) {
@@ -1124,7 +1121,7 @@ void gen_intermediate_code_internal(LM32CPU *cpu,
         num_insns++;
 
     } while (!dc->is_jmp
-         && tcg_ctx.gen_opc_ptr < gen_opc_end
+         && !tcg_op_buf_full()
          && !cs->singlestep_enabled
          && !singlestep
          && (dc->pc < next_page_start)
@@ -1160,7 +1157,7 @@ void gen_intermediate_code_internal(LM32CPU *cpu,
     gen_tb_end(tb, num_insns);
 
     if (search_pc) {
-        j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
+        j = tcg_op_buf_count();
         lj++;
         while (lj <= j) {
             tcg_ctx.gen_opc_instr_start[lj++] = 0;
@@ -1174,9 +1171,8 @@ void gen_intermediate_code_internal(LM32CPU *cpu,
     if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) {
         qemu_log("\n");
         log_target_disas(env, pc_start, dc->pc - pc_start, 0);
-        qemu_log("\nisize=%d osize=%td\n",
-            dc->pc - pc_start, tcg_ctx.gen_opc_ptr -
-            tcg_ctx.gen_opc_buf);
+        qemu_log("\nisize=%d osize=%d\n",
+                 dc->pc - pc_start, tcg_op_buf_count());
     }
 #endif
 }
diff --git a/target-m68k/translate.c b/target-m68k/translate.c
index 2c396ef..5c2f4d0 100644
--- a/target-m68k/translate.c
+++ b/target-m68k/translate.c
@@ -2980,7 +2980,6 @@ gen_intermediate_code_internal(M68kCPU *cpu, TranslationBlock *tb,
     CPUState *cs = CPU(cpu);
     CPUM68KState *env = &cpu->env;
     DisasContext dc1, *dc = &dc1;
-    uint16_t *gen_opc_end;
     CPUBreakpoint *bp;
     int j, lj;
     target_ulong pc_start;
@@ -2993,8 +2992,6 @@ gen_intermediate_code_internal(M68kCPU *cpu, TranslationBlock *tb,
 
     dc->tb = tb;
 
-    gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE;
-
     dc->env = env;
     dc->is_jmp = DISAS_NEXT;
     dc->pc = pc_start;
@@ -3026,7 +3023,7 @@ gen_intermediate_code_internal(M68kCPU *cpu, TranslationBlock *tb,
                 break;
         }
         if (search_pc) {
-            j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
+            j = tcg_op_buf_count();
             if (lj < j) {
                 lj++;
                 while (lj < j)
@@ -3041,7 +3038,7 @@ gen_intermediate_code_internal(M68kCPU *cpu, TranslationBlock *tb,
         dc->insn_pc = dc->pc;
 	disas_m68k_insn(env, dc);
         num_insns++;
-    } while (!dc->is_jmp && tcg_ctx.gen_opc_ptr < gen_opc_end &&
+    } while (!dc->is_jmp && !tcg_op_buf_full() &&
              !cs->singlestep_enabled &&
              !singlestep &&
              (pc_offset) < (TARGET_PAGE_SIZE - 32) &&
@@ -3085,7 +3082,7 @@ gen_intermediate_code_internal(M68kCPU *cpu, TranslationBlock *tb,
     }
 #endif
     if (search_pc) {
-        j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
+        j = tcg_op_buf_count();
         lj++;
         while (lj <= j)
             tcg_ctx.gen_opc_instr_start[lj++] = 0;
diff --git a/target-microblaze/translate.c b/target-microblaze/translate.c
index 0e3d612..d37f235 100644
--- a/target-microblaze/translate.c
+++ b/target-microblaze/translate.c
@@ -1673,7 +1673,6 @@ gen_intermediate_code_internal(MicroBlazeCPU *cpu, TranslationBlock *tb,
 {
     CPUState *cs = CPU(cpu);
     CPUMBState *env = &cpu->env;
-    uint16_t *gen_opc_end;
     uint32_t pc_start;
     int j, lj;
     struct DisasContext ctx;
@@ -1688,8 +1687,6 @@ gen_intermediate_code_internal(MicroBlazeCPU *cpu, TranslationBlock *tb,
     dc->tb = tb;
     org_flags = dc->synced_flags = dc->tb_flags = tb->flags;
 
-    gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE;
-
     dc->is_jmp = DISAS_NEXT;
     dc->jmp = 0;
     dc->delayed_branch = !!(dc->tb_flags & D_FLAG);
@@ -1732,7 +1729,7 @@ gen_intermediate_code_internal(MicroBlazeCPU *cpu, TranslationBlock *tb,
         check_breakpoint(env, dc);
 
         if (search_pc) {
-            j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
+            j = tcg_op_buf_count();
             if (lj < j) {
                 lj++;
                 while (lj < j)
@@ -1795,10 +1792,10 @@ gen_intermediate_code_internal(MicroBlazeCPU *cpu, TranslationBlock *tb,
             break;
         }
     } while (!dc->is_jmp && !dc->cpustate_changed
-         && tcg_ctx.gen_opc_ptr < gen_opc_end
-                 && !singlestep
-         && (dc->pc < next_page_start)
-                 && num_insns < max_insns);
+             && !tcg_op_buf_full()
+             && !singlestep
+             && (dc->pc < next_page_start)
+             && num_insns < max_insns);
 
     npc = dc->pc;
     if (dc->jmp == JMP_DIRECT || dc->jmp == JMP_DIRECT_CC) {
@@ -1848,7 +1845,7 @@ gen_intermediate_code_internal(MicroBlazeCPU *cpu, TranslationBlock *tb,
     gen_tb_end(tb, num_insns);
 
     if (search_pc) {
-        j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
+        j = tcg_op_buf_count();
         lj++;
         while (lj <= j)
             tcg_ctx.gen_opc_instr_start[lj++] = 0;
@@ -1864,9 +1861,8 @@ gen_intermediate_code_internal(MicroBlazeCPU *cpu, TranslationBlock *tb,
 #if DISAS_GNU
         log_target_disas(env, pc_start, dc->pc - pc_start, 0);
 #endif
-        qemu_log("\nisize=%d osize=%td\n",
-            dc->pc - pc_start, tcg_ctx.gen_opc_ptr -
-            tcg_ctx.gen_opc_buf);
+        qemu_log("\nisize=%d osize=%d\n",
+                 dc->pc - pc_start, tcg_op_buf_count());
     }
 #endif
 #endif
diff --git a/target-mips/translate.c b/target-mips/translate.c
index ff51bc9..e3d6684 100644
--- a/target-mips/translate.c
+++ b/target-mips/translate.c
@@ -18984,7 +18984,6 @@ gen_intermediate_code_internal(MIPSCPU *cpu, TranslationBlock *tb,
     CPUMIPSState *env = &cpu->env;
     DisasContext ctx;
     target_ulong pc_start;
-    uint16_t *gen_opc_end;
     CPUBreakpoint *bp;
     int j, lj = -1;
     int num_insns;
@@ -18996,7 +18995,6 @@ gen_intermediate_code_internal(MIPSCPU *cpu, TranslationBlock *tb,
         qemu_log("search pc %d\n", search_pc);
 
     pc_start = tb->pc;
-    gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE;
     ctx.pc = pc_start;
     ctx.saved_pc = -1;
     ctx.singlestep_enabled = cs->singlestep_enabled;
@@ -19040,7 +19038,7 @@ gen_intermediate_code_internal(MIPSCPU *cpu, TranslationBlock *tb,
         }
 
         if (search_pc) {
-            j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
+            j = tcg_op_buf_count();
             if (lj < j) {
                 lj++;
                 while (lj < j)
@@ -19098,7 +19096,7 @@ gen_intermediate_code_internal(MIPSCPU *cpu, TranslationBlock *tb,
         if ((ctx.pc & (TARGET_PAGE_SIZE - 1)) == 0)
             break;
 
-        if (tcg_ctx.gen_opc_ptr >= gen_opc_end) {
+        if (tcg_op_buf_full()) {
             break;
         }
 
@@ -19135,7 +19133,7 @@ done_generating:
     gen_tb_end(tb, num_insns);
 
     if (search_pc) {
-        j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
+        j = tcg_op_buf_count();
         lj++;
         while (lj <= j)
             tcg_ctx.gen_opc_instr_start[lj++] = 0;
diff --git a/target-moxie/translate.c b/target-moxie/translate.c
index 675c4d0..cc9af66 100644
--- a/target-moxie/translate.c
+++ b/target-moxie/translate.c
@@ -827,14 +827,12 @@ gen_intermediate_code_internal(MoxieCPU *cpu, TranslationBlock *tb,
     CPUState *cs = CPU(cpu);
     DisasContext ctx;
     target_ulong pc_start;
-    uint16_t *gen_opc_end;
     CPUBreakpoint *bp;
     int j, lj = -1;
     CPUMoxieState *env = &cpu->env;
     int num_insns;
 
     pc_start = tb->pc;
-    gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE;
     ctx.pc = pc_start;
     ctx.saved_pc = -1;
     ctx.tb = tb;
@@ -857,7 +855,7 @@ gen_intermediate_code_internal(MoxieCPU *cpu, TranslationBlock *tb,
         }
 
         if (search_pc) {
-            j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
+            j = tcg_op_buf_count();
             if (lj < j) {
                 lj++;
                 while (lj < j) {
@@ -879,7 +877,7 @@ gen_intermediate_code_internal(MoxieCPU *cpu, TranslationBlock *tb,
         if ((ctx.pc & (TARGET_PAGE_SIZE - 1)) == 0) {
             break;
         }
-    } while (ctx.bstate == BS_NONE && tcg_ctx.gen_opc_ptr < gen_opc_end);
+    } while (ctx.bstate == BS_NONE && !tcg_op_buf_full());
 
     if (cs->singlestep_enabled) {
         tcg_gen_movi_tl(cpu_pc, ctx.pc);
@@ -902,7 +900,7 @@ gen_intermediate_code_internal(MoxieCPU *cpu, TranslationBlock *tb,
     gen_tb_end(tb, num_insns);
 
     if (search_pc) {
-        j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
+        j = tcg_op_buf_count();
         lj++;
         while (lj <= j) {
             tcg_ctx.gen_opc_instr_start[lj++] = 0;
diff --git a/target-openrisc/translate.c b/target-openrisc/translate.c
index 9dcc9ae..fdc1a16 100644
--- a/target-openrisc/translate.c
+++ b/target-openrisc/translate.c
@@ -1642,7 +1642,6 @@ static inline void gen_intermediate_code_internal(OpenRISCCPU *cpu,
 {
     CPUState *cs = CPU(cpu);
     struct DisasContext ctx, *dc = &ctx;
-    uint16_t *gen_opc_end;
     uint32_t pc_start;
     int j, k;
     uint32_t next_page_start;
@@ -1652,7 +1651,6 @@ static inline void gen_intermediate_code_internal(OpenRISCCPU *cpu,
     pc_start = tb->pc;
     dc->tb = tb;
 
-    gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE;
     dc->is_jmp = DISAS_NEXT;
     dc->ppc = pc_start;
     dc->pc = pc_start;
@@ -1680,7 +1678,7 @@ static inline void gen_intermediate_code_internal(OpenRISCCPU *cpu,
     do {
         check_breakpoint(cpu, dc);
         if (search_pc) {
-            j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
+            j = tcg_op_buf_count();
             if (k < j) {
                 k++;
                 while (k < j) {
@@ -1721,7 +1719,7 @@ static inline void gen_intermediate_code_internal(OpenRISCCPU *cpu,
             }
         }
     } while (!dc->is_jmp
-             && tcg_ctx.gen_opc_ptr < gen_opc_end
+             && !tcg_op_buf_full()
              && !cs->singlestep_enabled
              && !singlestep
              && (dc->pc < next_page_start)
@@ -1761,7 +1759,7 @@ static inline void gen_intermediate_code_internal(OpenRISCCPU *cpu,
     gen_tb_end(tb, num_insns);
 
     if (search_pc) {
-        j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
+        j = tcg_op_buf_count();
         k++;
         while (k <= j) {
             tcg_ctx.gen_opc_instr_start[k++] = 0;
@@ -1775,9 +1773,8 @@ static inline void gen_intermediate_code_internal(OpenRISCCPU *cpu,
     if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)) {
         qemu_log("\n");
         log_target_disas(&cpu->env, pc_start, dc->pc - pc_start, 0);
-        qemu_log("\nisize=%d osize=%td\n",
-            dc->pc - pc_start, tcg_ctx.gen_opc_ptr -
-            tcg_ctx.gen_opc_buf);
+        qemu_log("\nisize=%d osize=%d\n",
+                 dc->pc - pc_start, tcg_op_buf_count());
     }
 #endif
 }
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 9ab81a9..0812418 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -11273,14 +11273,12 @@ static inline void gen_intermediate_code_internal(PowerPCCPU *cpu,
     DisasContext ctx, *ctxp = &ctx;
     opc_handler_t **table, *handler;
     target_ulong pc_start;
-    uint16_t *gen_opc_end;
     CPUBreakpoint *bp;
     int j, lj = -1;
     int num_insns;
     int max_insns;
 
     pc_start = tb->pc;
-    gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE;
     ctx.nip = pc_start;
     ctx.tb = tb;
     ctx.exception = POWERPC_EXCP_NONE;
@@ -11332,8 +11330,7 @@ static inline void gen_intermediate_code_internal(PowerPCCPU *cpu,
     gen_tb_start();
     tcg_clear_temp_count();
     /* Set env in case of segfault during code fetch */
-    while (ctx.exception == POWERPC_EXCP_NONE
-            && tcg_ctx.gen_opc_ptr < gen_opc_end) {
+    while (ctx.exception == POWERPC_EXCP_NONE && !tcg_op_buf_full()) {
         if (unlikely(!QTAILQ_EMPTY(&cs->breakpoints))) {
             QTAILQ_FOREACH(bp, &cs->breakpoints, entry) {
                 if (bp->pc == ctx.nip) {
@@ -11343,7 +11340,7 @@ static inline void gen_intermediate_code_internal(PowerPCCPU *cpu,
             }
         }
         if (unlikely(search_pc)) {
-            j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
+            j = tcg_op_buf_count();
             if (lj < j) {
                 lj++;
                 while (lj < j)
@@ -11451,7 +11448,7 @@ static inline void gen_intermediate_code_internal(PowerPCCPU *cpu,
     gen_tb_end(tb, num_insns);
 
     if (unlikely(search_pc)) {
-        j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
+        j = tcg_op_buf_count();
         lj++;
         while (lj <= j)
             tcg_ctx.gen_opc_instr_start[lj++] = 0;
diff --git a/target-s390x/translate.c b/target-s390x/translate.c
index 661d110..215ef2e 100644
--- a/target-s390x/translate.c
+++ b/target-s390x/translate.c
@@ -4750,7 +4750,6 @@ static inline void gen_intermediate_code_internal(S390CPU *cpu,
     DisasContext dc;
     target_ulong pc_start;
     uint64_t next_page_start;
-    uint16_t *gen_opc_end;
     int j, lj = -1;
     int num_insns, max_insns;
     CPUBreakpoint *bp;
@@ -4769,8 +4768,6 @@ static inline void gen_intermediate_code_internal(S390CPU *cpu,
     dc.cc_op = CC_OP_DYNAMIC;
     do_debug = dc.singlestep_enabled = cs->singlestep_enabled;
 
-    gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE;
-
     next_page_start = (pc_start & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
 
     num_insns = 0;
@@ -4783,7 +4780,7 @@ static inline void gen_intermediate_code_internal(S390CPU *cpu,
 
     do {
         if (search_pc) {
-            j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
+            j = tcg_op_buf_count();
             if (lj < j) {
                 lj++;
                 while (lj < j) {
@@ -4821,7 +4818,7 @@ static inline void gen_intermediate_code_internal(S390CPU *cpu,
            or exhaust instruction count, stop generation.  */
         if (status == NO_EXIT
             && (dc.pc >= next_page_start
-                || tcg_ctx.gen_opc_ptr >= gen_opc_end
+                || tcg_op_buf_full()
                 || num_insns >= max_insns
                 || singlestep
                 || cs->singlestep_enabled)) {
@@ -4858,7 +4855,7 @@ static inline void gen_intermediate_code_internal(S390CPU *cpu,
     gen_tb_end(tb, num_insns);
 
     if (search_pc) {
-        j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
+        j = tcg_op_buf_count();
         lj++;
         while (lj <= j) {
             tcg_ctx.gen_opc_instr_start[lj++] = 0;
diff --git a/target-sh4/translate.c b/target-sh4/translate.c
index 0550611..bc82044 100644
--- a/target-sh4/translate.c
+++ b/target-sh4/translate.c
@@ -1865,14 +1865,12 @@ gen_intermediate_code_internal(SuperHCPU *cpu, TranslationBlock *tb,
     CPUSH4State *env = &cpu->env;
     DisasContext ctx;
     target_ulong pc_start;
-    static uint16_t *gen_opc_end;
     CPUBreakpoint *bp;
     int i, ii;
     int num_insns;
     int max_insns;
 
     pc_start = tb->pc;
-    gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE;
     ctx.pc = pc_start;
     ctx.flags = (uint32_t)tb->flags;
     ctx.bstate = BS_NONE;
@@ -1891,7 +1889,7 @@ gen_intermediate_code_internal(SuperHCPU *cpu, TranslationBlock *tb,
     if (max_insns == 0)
         max_insns = CF_COUNT_MASK;
     gen_tb_start();
-    while (ctx.bstate == BS_NONE && tcg_ctx.gen_opc_ptr < gen_opc_end) {
+    while (ctx.bstate == BS_NONE && !tcg_op_buf_full()) {
         if (unlikely(!QTAILQ_EMPTY(&cs->breakpoints))) {
             QTAILQ_FOREACH(bp, &cs->breakpoints, entry) {
                 if (ctx.pc == bp->pc) {
@@ -1904,7 +1902,7 @@ gen_intermediate_code_internal(SuperHCPU *cpu, TranslationBlock *tb,
 	    }
 	}
         if (search_pc) {
-            i = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
+            i = tcg_op_buf_count();
             if (ii < i) {
                 ii++;
                 while (ii < i)
@@ -1964,7 +1962,7 @@ gen_intermediate_code_internal(SuperHCPU *cpu, TranslationBlock *tb,
     gen_tb_end(tb, num_insns);
 
     if (search_pc) {
-        i = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
+        i = tcg_op_buf_count();
         ii++;
         while (ii <= i)
             tcg_ctx.gen_opc_instr_start[ii++] = 0;
diff --git a/target-sparc/translate.c b/target-sparc/translate.c
index fc239bd..99b4f53 100644
--- a/target-sparc/translate.c
+++ b/target-sparc/translate.c
@@ -5245,7 +5245,6 @@ static inline void gen_intermediate_code_internal(SPARCCPU *cpu,
     CPUState *cs = CPU(cpu);
     CPUSPARCState *env = &cpu->env;
     target_ulong pc_start, last_pc;
-    uint16_t *gen_opc_end;
     DisasContext dc1, *dc = &dc1;
     CPUBreakpoint *bp;
     int j, lj = -1;
@@ -5265,7 +5264,6 @@ static inline void gen_intermediate_code_internal(SPARCCPU *cpu,
     dc->fpu_enabled = tb_fpu_enabled(tb->flags);
     dc->address_mask_32bit = tb_am_enabled(tb->flags);
     dc->singlestep = (cs->singlestep_enabled || singlestep);
-    gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE;
 
     num_insns = 0;
     max_insns = tb->cflags & CF_COUNT_MASK;
@@ -5287,7 +5285,7 @@ static inline void gen_intermediate_code_internal(SPARCCPU *cpu,
         }
         if (spc) {
             qemu_log("Search PC...\n");
-            j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
+            j = tcg_op_buf_count();
             if (lj < j) {
                 lj++;
                 while (lj < j)
@@ -5320,7 +5318,7 @@ static inline void gen_intermediate_code_internal(SPARCCPU *cpu,
         if (dc->singlestep) {
             break;
         }
-    } while ((tcg_ctx.gen_opc_ptr < gen_opc_end) &&
+    } while (!tcg_op_buf_full() &&
              (dc->pc - pc_start) < (TARGET_PAGE_SIZE - 32) &&
              num_insns < max_insns);
 
@@ -5344,7 +5342,7 @@ static inline void gen_intermediate_code_internal(SPARCCPU *cpu,
     gen_tb_end(tb, num_insns);
 
     if (spc) {
-        j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
+        j = tcg_op_buf_count();
         lj++;
         while (lj <= j)
             tcg_ctx.gen_opc_instr_start[lj++] = 0;
diff --git a/target-tricore/translate.c b/target-tricore/translate.c
index 48b1a5e..68e3c53 100644
--- a/target-tricore/translate.c
+++ b/target-tricore/translate.c
@@ -2430,7 +2430,6 @@ gen_intermediate_code_internal(TriCoreCPU *cpu, struct TranslationBlock *tb,
     DisasContext ctx;
     target_ulong pc_start;
     int num_insns;
-    uint16_t *gen_opc_end;
 
     if (search_pc) {
         qemu_log("search pc %d\n", search_pc);
@@ -2438,7 +2437,6 @@ gen_intermediate_code_internal(TriCoreCPU *cpu, struct TranslationBlock *tb,
 
     num_insns = 0;
     pc_start = tb->pc;
-    gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE;
     ctx.pc = pc_start;
     ctx.saved_pc = -1;
     ctx.tb = tb;
@@ -2454,7 +2452,7 @@ gen_intermediate_code_internal(TriCoreCPU *cpu, struct TranslationBlock *tb,
 
         num_insns++;
 
-        if (tcg_ctx.gen_opc_ptr >= gen_opc_end) {
+        if (tcg_op_buf_full()) {
             gen_save_pc(ctx.next_pc);
             tcg_gen_exit_tb(0);
             break;
diff --git a/target-unicore32/translate.c b/target-unicore32/translate.c
index 04ee3c2..e95c1e2 100644
--- a/target-unicore32/translate.c
+++ b/target-unicore32/translate.c
@@ -1877,7 +1877,6 @@ static inline void gen_intermediate_code_internal(UniCore32CPU *cpu,
     CPUUniCore32State *env = &cpu->env;
     DisasContext dc1, *dc = &dc1;
     CPUBreakpoint *bp;
-    uint16_t *gen_opc_end;
     int j, lj;
     target_ulong pc_start;
     uint32_t next_page_start;
@@ -1891,8 +1890,6 @@ static inline void gen_intermediate_code_internal(UniCore32CPU *cpu,
 
     dc->tb = tb;
 
-    gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE;
-
     dc->is_jmp = DISAS_NEXT;
     dc->pc = pc_start;
     dc->singlestep_enabled = cs->singlestep_enabled;
@@ -1933,7 +1930,7 @@ static inline void gen_intermediate_code_internal(UniCore32CPU *cpu,
             }
         }
         if (search_pc) {
-            j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
+            j = tcg_op_buf_count();
             if (lj < j) {
                 lj++;
                 while (lj < j) {
@@ -1965,7 +1962,7 @@ static inline void gen_intermediate_code_internal(UniCore32CPU *cpu,
          * Also stop translation when a page boundary is reached.  This
          * ensures prefetch aborts occur at the right place.  */
         num_insns++;
-    } while (!dc->is_jmp && tcg_ctx.gen_opc_ptr < gen_opc_end &&
+    } while (!dc->is_jmp && !tcg_op_buf_full() &&
              !cs->singlestep_enabled &&
              !singlestep &&
              dc->pc < next_page_start &&
@@ -2047,7 +2044,7 @@ done_generating:
     }
 #endif
     if (search_pc) {
-        j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
+        j = tcg_op_buf_count();
         lj++;
         while (lj <= j) {
             tcg_ctx.gen_opc_instr_start[lj++] = 0;
diff --git a/target-xtensa/translate.c b/target-xtensa/translate.c
index 8e135fc..7ee565c 100644
--- a/target-xtensa/translate.c
+++ b/target-xtensa/translate.c
@@ -2987,7 +2987,6 @@ void gen_intermediate_code_internal(XtensaCPU *cpu,
     DisasContext dc;
     int insn_count = 0;
     int j, lj = -1;
-    uint16_t *gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE;
     int max_insns = tb->cflags & CF_COUNT_MASK;
     uint32_t pc_start = tb->pc;
     uint32_t next_page_start =
@@ -3030,7 +3029,7 @@ void gen_intermediate_code_internal(XtensaCPU *cpu,
         check_breakpoint(env, &dc);
 
         if (search_pc) {
-            j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
+            j = tcg_op_buf_count();
             if (lj < j) {
                 lj++;
                 while (lj < j) {
@@ -3081,7 +3080,7 @@ void gen_intermediate_code_internal(XtensaCPU *cpu,
     } while (dc.is_jmp == DISAS_NEXT &&
             insn_count < max_insns &&
             dc.pc < next_page_start &&
-            tcg_ctx.gen_opc_ptr < gen_opc_end);
+            !tcg_op_buf_full());
 
     reset_litbase(&dc);
     reset_sar_tracker(&dc);
@@ -3107,7 +3106,7 @@ void gen_intermediate_code_internal(XtensaCPU *cpu,
     }
 #endif
     if (search_pc) {
-        j = tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
+        j = tcg_op_buf_count();
         memset(tcg_ctx.gen_opc_instr_start + lj + 1, 0,
                 (j - lj) * sizeof(tcg_ctx.gen_opc_instr_start[0]));
     } else {
diff --git a/tcg/tcg.h b/tcg/tcg.h
index f4b9033..ff94931 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -537,6 +537,18 @@ struct TCGContext {
 
 extern TCGContext tcg_ctx;
 
+/* The number of opcodes emitted so far.  */
+static inline int tcg_op_buf_count(void)
+{
+    return tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
+}
+
+/* Test for whether to terminate the TB for using too many opcodes.  */
+static inline bool tcg_op_buf_full(void)
+{
+    return tcg_op_buf_count() >= OPC_MAX_SIZE;
+}
+
 /* pool based memory allocation */
 
 void *tcg_malloc_internal(TCGContext *s, int size);
-- 
1.9.3
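
For readers skimming the per-target hunks above: each one replaces an
open-coded pointer comparison against a local gen_opc_end with the two
new helpers.  The following is only a minimal stand-alone sketch of that
equivalence, not code from the patch.  MockTCGContext, mock_emit,
mock_translate_loop and the tiny OPC_MAX_SIZE value are assumptions made
up for illustration; the real definitions live in the QEMU tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in value; the real limit is defined by QEMU. */
#define OPC_MAX_SIZE 8

/* Hypothetical cut-down context holding only the two fields used here. */
typedef struct MockTCGContext {
    uint16_t gen_opc_buf[OPC_MAX_SIZE + 4]; /* some slack past the limit */
    uint16_t *gen_opc_ptr;                  /* next free opcode slot */
} MockTCGContext;

static MockTCGContext tcg_ctx;

/* The number of opcodes emitted so far (same body as in the patch). */
static inline int tcg_op_buf_count(void)
{
    return tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
}

/* Test for whether to terminate the TB for using too many opcodes. */
static inline bool tcg_op_buf_full(void)
{
    return tcg_op_buf_count() >= OPC_MAX_SIZE;
}

/* Hypothetical helper: append one dummy opcode to the buffer. */
static void mock_emit(uint16_t opc)
{
    *tcg_ctx.gen_opc_ptr++ = opc;
}

/*
 * Emit one opcode per iteration until the buffer reports full, the way
 * the converted translator loops do.  At every step, check that the old
 * pointer comparison and the new predicate agree.  Returns the number
 * of opcodes emitted.
 */
static int mock_translate_loop(void)
{
    uint16_t *gen_opc_end = tcg_ctx.gen_opc_buf + OPC_MAX_SIZE;

    tcg_ctx.gen_opc_ptr = tcg_ctx.gen_opc_buf;
    do {
        mock_emit(1);
        assert((tcg_ctx.gen_opc_ptr < gen_opc_end) == !tcg_op_buf_full());
    } while (!tcg_op_buf_full());

    return tcg_op_buf_count();
}
```

Calling mock_translate_loop() stops once the count reaches OPC_MAX_SIZE.
The point of the helpers is that targets no longer need to know how the
buffer is laid out, which is what allows a later patch to change the
representation without touching every translate.c again.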


* [Qemu-devel] [PATCH 2.3 5/8] tcg: Put opcodes in a linked list
  2014-11-11 16:24 [Qemu-devel] [PATCH 2.3 0/8] Linked list for tcg ops Richard Henderson
                   ` (3 preceding siblings ...)
  2014-11-11 16:24 ` [Qemu-devel] [PATCH 2.3 4/8] tcg: Introduce tcg_op_buf_count and tcg_op_buf_full Richard Henderson
@ 2014-11-11 16:24 ` Richard Henderson
  2014-11-14 15:03   ` Bastian Koppelmann
  2014-11-11 16:24 ` [Qemu-devel] [PATCH 2.3 6/8] tcg: Remove opcodes instead of noping them out Richard Henderson
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 21+ messages in thread
From: Richard Henderson @ 2014-11-11 16:24 UTC (permalink / raw)
  To: qemu-devel; +Cc: aurelien

The previous setup required ops and args to be completely sequential,
and was error-prone when it came to both iteration and optimization.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 include/exec/gen-icount.h |  22 ++-
 tcg/optimize.c            | 286 ++++++++++++++---------------------
 tcg/tcg-op.c              | 188 ++++++++++++-----------
 tcg/tcg.c                 | 376 +++++++++++++++++++---------------------------
 tcg/tcg.h                 |  58 ++++---
 5 files changed, 431 insertions(+), 499 deletions(-)
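
As a reading aid for the hunks below, here is a rough stand-alone sketch
of the layout they imply: each op records its opcode, the index of its
first argument in a shared parameter buffer, and the index of the next
op, with -1 ending the list.  Only the field names opc, args, next,
gen_op_buf, gen_opparam_buf, gen_first_op_idx and gen_last_op_idx are
taken from the diff.  Everything prefixed Mock or mock_, the buffer
sizes, the field types and the two gen_next_* counters are assumptions
for illustration, and unlike gen_tb_end in the patch this sketch keeps
the list terminated after every append.

```c
#include <assert.h>
#include <stdint.h>

enum { MOCK_OP_MAX = 16, MOCK_ARG_MAX = 64 };  /* hypothetical sizes */

typedef uintptr_t MockArg;                     /* stand-in for TCGArg */

typedef struct MockOp {
    int opc;   /* opcode number */
    int args;  /* index of the first argument in gen_opparam_buf */
    int next;  /* index of the following op, or -1 at the end */
} MockOp;

typedef struct MockContext {
    MockOp gen_op_buf[MOCK_OP_MAX];
    MockArg gen_opparam_buf[MOCK_ARG_MAX];
    int gen_first_op_idx;
    int gen_last_op_idx;
    int gen_next_op_idx;    /* hypothetical: next free op slot */
    int gen_next_parm_idx;  /* hypothetical: next free argument slot */
} MockContext;

static void mock_init(MockContext *s)
{
    s->gen_first_op_idx = -1;
    s->gen_last_op_idx = -1;
    s->gen_next_op_idx = 0;
    s->gen_next_parm_idx = 0;
}

/* Append one op with nargs arguments and link it after the last op. */
static int mock_emit(MockContext *s, int opc, const MockArg *args, int nargs)
{
    int oi = s->gen_next_op_idx;
    int pi = s->gen_next_parm_idx;
    MockOp *op;
    int i;

    assert(oi < MOCK_OP_MAX && pi + nargs <= MOCK_ARG_MAX);
    s->gen_next_op_idx = oi + 1;
    s->gen_next_parm_idx = pi + nargs;

    for (i = 0; i < nargs; i++) {
        s->gen_opparam_buf[pi + i] = args[i];
    }

    op = &s->gen_op_buf[oi];
    op->opc = opc;
    op->args = pi;
    op->next = -1;

    if (s->gen_last_op_idx >= 0) {
        s->gen_op_buf[s->gen_last_op_idx].next = oi;
    } else {
        s->gen_first_op_idx = oi;
    }
    s->gen_last_op_idx = oi;
    return oi;
}

/*
 * Walk the list with the same loop shape as the reworked
 * tcg_constant_folding: fetch op and args from the index, and read
 * op->next before doing any work on the op.  As a trivial payload,
 * sum the first argument of every op.
 */
static MockArg mock_sum_first_args(const MockContext *s)
{
    MockArg sum = 0;
    int oi, oi_next;

    for (oi = s->gen_first_op_idx; oi >= 0; oi = oi_next) {
        const MockOp *op = &s->gen_op_buf[oi];
        const MockArg *args = &s->gen_opparam_buf[op->args];

        oi_next = op->next;
        sum += args[0];
    }
    return sum;
}
```

Because every op carries its own argument index, a pass can rewrite an
op in place (for example turning a three-operand op into a two-operand
mov) without shifting the arguments of the ops that follow, which is
why the "args += 3; gen_args += 2;" bookkeeping disappears from
optimize.c below.
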

diff --git a/include/exec/gen-icount.h b/include/exec/gen-icount.h
index d5266ff..45cfdf7 100644
--- a/include/exec/gen-icount.h
+++ b/include/exec/gen-icount.h
@@ -11,8 +11,8 @@ static int exitreq_label;
 
 static inline void gen_tb_start(void)
 {
-    TCGv_i32 count;
-    TCGv_i32 flag;
+    TCGv_i32 count, flag, imm;
+    int i;
 
     exitreq_label = gen_new_label();
     flag = tcg_temp_new_i32();
@@ -21,16 +21,25 @@ static inline void gen_tb_start(void)
     tcg_gen_brcondi_i32(TCG_COND_NE, flag, 0, exitreq_label);
     tcg_temp_free_i32(flag);
 
-    if (!use_icount)
+    if (!use_icount) {
         return;
+    }
 
     icount_label = gen_new_label();
     count = tcg_temp_local_new_i32();
     tcg_gen_ld_i32(count, cpu_env,
                    -ENV_OFFSET + offsetof(CPUState, icount_decr.u32));
+
+    imm = tcg_temp_new_i32();
+    tcg_gen_movi_i32(imm, 0xdeadbeef);
+
     /* This is a horrid hack to allow fixing up the value later.  */
-    icount_arg = tcg_ctx.gen_opparam_ptr + 1;
-    tcg_gen_subi_i32(count, count, 0xdeadbeef);
+    i = tcg_ctx.gen_last_op_idx;
+    i = tcg_ctx.gen_op_buf[i].args;
+    icount_arg = &tcg_ctx.gen_opparam_buf[i + 1];
+
+    tcg_gen_sub_i32(count, count, imm);
+    tcg_temp_free_i32(imm);
 
     tcg_gen_brcondi_i32(TCG_COND_LT, count, 0, icount_label);
     tcg_gen_st16_i32(count, cpu_env,
@@ -49,7 +58,8 @@ static void gen_tb_end(TranslationBlock *tb, int num_insns)
         tcg_gen_exit_tb((uintptr_t)tb + TB_EXIT_ICOUNT_EXPIRED);
     }
 
-    *tcg_ctx.gen_opc_ptr = INDEX_op_end;
+    /* Terminate the linked list.  */
+    tcg_ctx.gen_op_buf[tcg_ctx.gen_last_op_idx].next = -1;
 }
 
 static inline void gen_io_start(void)
diff --git a/tcg/optimize.c b/tcg/optimize.c
index 34ae3c2..f2b8acf 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -162,13 +162,13 @@ static bool temps_are_copies(TCGArg arg1, TCGArg arg2)
     return false;
 }
 
-static void tcg_opt_gen_mov(TCGContext *s, int op_index, TCGArg *gen_args,
+static void tcg_opt_gen_mov(TCGContext *s, TCGOp *op, TCGArg *args,
                             TCGOpcode old_op, TCGArg dst, TCGArg src)
 {
     TCGOpcode new_op = op_to_mov(old_op);
     tcg_target_ulong mask;
 
-    s->gen_opc_buf[op_index] = new_op;
+    op->opc = new_op;
 
     reset_temp(dst);
     mask = temps[src].mask;
@@ -193,17 +193,17 @@ static void tcg_opt_gen_mov(TCGContext *s, int op_index, TCGArg *gen_args,
         temps[src].next_copy = dst;
     }
 
-    gen_args[0] = dst;
-    gen_args[1] = src;
+    args[0] = dst;
+    args[1] = src;
 }
 
-static void tcg_opt_gen_movi(TCGContext *s, int op_index, TCGArg *gen_args,
+static void tcg_opt_gen_movi(TCGContext *s, TCGOp *op, TCGArg *args,
                              TCGOpcode old_op, TCGArg dst, TCGArg val)
 {
     TCGOpcode new_op = op_to_movi(old_op);
     tcg_target_ulong mask;
 
-    s->gen_opc_buf[op_index] = new_op;
+    op->opc = new_op;
 
     reset_temp(dst);
     temps[dst].state = TCG_TEMP_CONST;
@@ -215,8 +215,8 @@ static void tcg_opt_gen_movi(TCGContext *s, int op_index, TCGArg *gen_args,
     }
     temps[dst].mask = mask;
 
-    gen_args[0] = dst;
-    gen_args[1] = val;
+    args[0] = dst;
+    args[1] = val;
 }
 
 static TCGArg do_constant_folding_2(TCGOpcode op, TCGArg x, TCGArg y)
@@ -533,11 +533,9 @@ static bool swap_commutative2(TCGArg *p1, TCGArg *p2)
 }
 
 /* Propagate constants and copies, fold constant expressions. */
-static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
-                                    TCGArg *args, TCGOpDef *tcg_op_defs)
+static void tcg_constant_folding(TCGContext *s)
 {
-    int nb_ops, op_index, nb_temps, nb_globals;
-    TCGArg *gen_args;
+    int oi, oi_next, nb_temps, nb_globals;
 
     /* Array VALS has an element for each temp.
        If this temp holds a constant then its value is kept in VALS' element.
@@ -548,24 +546,23 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
     nb_globals = s->nb_globals;
     reset_all_temps(nb_temps);
 
-    nb_ops = tcg_opc_ptr - s->gen_opc_buf;
-    gen_args = args;
-    for (op_index = 0; op_index < nb_ops; op_index++) {
-        TCGOpcode op = s->gen_opc_buf[op_index];
-        const TCGOpDef *def = &tcg_op_defs[op];
+    for (oi = s->gen_first_op_idx; oi >= 0; oi = oi_next) {
         tcg_target_ulong mask, partmask, affected;
-        int nb_oargs, nb_iargs, nb_args, i;
+        int nb_oargs, nb_iargs, i;
         TCGArg tmp;
 
-        if (op == INDEX_op_call) {
-            *gen_args++ = tmp = *args++;
-            nb_oargs = tmp >> 16;
-            nb_iargs = tmp & 0xffff;
-            nb_args = nb_oargs + nb_iargs + def->nb_cargs;
+        TCGOp * const op = &s->gen_op_buf[oi];
+        TCGArg * const args = &s->gen_opparam_buf[op->args];
+        TCGOpcode opc = op->opc;
+        const TCGOpDef *def = &tcg_op_defs[opc];
+
+        oi_next = op->next;
+        if (opc == INDEX_op_call) {
+            nb_oargs = op->callo;
+            nb_iargs = op->calli;
         } else {
             nb_oargs = def->nb_oargs;
             nb_iargs = def->nb_iargs;
-            nb_args = def->nb_args;
         }
 
         /* Do copy propagation */
@@ -576,7 +573,7 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
         }
 
         /* For commutative operations make constant second argument */
-        switch (op) {
+        switch (opc) {
         CASE_OP_32_64(add):
         CASE_OP_32_64(mul):
         CASE_OP_32_64(and):
@@ -634,7 +631,7 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
 
         /* Simplify expressions for "shift/rot r, 0, a => movi r, 0",
            and "sub r, 0, a => neg r, a" case.  */
-        switch (op) {
+        switch (opc) {
         CASE_OP_32_64(shl):
         CASE_OP_32_64(shr):
         CASE_OP_32_64(sar):
@@ -642,9 +639,7 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
         CASE_OP_32_64(rotr):
             if (temps[args[1]].state == TCG_TEMP_CONST
                 && temps[args[1]].val == 0) {
-                tcg_opt_gen_movi(s, op_index, gen_args, op, args[0], 0);
-                args += 3;
-                gen_args += 2;
+                tcg_opt_gen_movi(s, op, args, opc, args[0], 0);
                 continue;
             }
             break;
@@ -657,7 +652,7 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
                     /* Proceed with possible constant folding. */
                     break;
                 }
-                if (op == INDEX_op_sub_i32) {
+                if (opc == INDEX_op_sub_i32) {
                     neg_op = INDEX_op_neg_i32;
                     have_neg = TCG_TARGET_HAS_neg_i32;
                 } else {
@@ -669,12 +664,9 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
                 }
                 if (temps[args[1]].state == TCG_TEMP_CONST
                     && temps[args[1]].val == 0) {
-                    s->gen_opc_buf[op_index] = neg_op;
+                    op->opc = neg_op;
                     reset_temp(args[0]);
-                    gen_args[0] = args[0];
-                    gen_args[1] = args[2];
-                    args += 3;
-                    gen_args += 2;
+                    args[1] = args[2];
                     continue;
                 }
             }
@@ -728,12 +720,9 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
                 if (!have_not) {
                     break;
                 }
-                s->gen_opc_buf[op_index] = not_op;
+                op->opc = not_op;
                 reset_temp(args[0]);
-                gen_args[0] = args[0];
-                gen_args[1] = args[i];
-                args += 3;
-                gen_args += 2;
+                args[1] = args[i];
                 continue;
             }
         default:
@@ -741,7 +730,7 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
         }
 
         /* Simplify expression for "op r, a, const => mov r, a" cases */
-        switch (op) {
+        switch (opc) {
         CASE_OP_32_64(add):
         CASE_OP_32_64(sub):
         CASE_OP_32_64(shl):
@@ -769,12 +758,10 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
             break;
         do_mov3:
             if (temps_are_copies(args[0], args[1])) {
-                s->gen_opc_buf[op_index] = INDEX_op_nop;
+                op->opc = INDEX_op_nop;
             } else {
-                tcg_opt_gen_mov(s, op_index, gen_args, op, args[0], args[1]);
-                gen_args += 2;
+                tcg_opt_gen_mov(s, op, args, opc, args[0], args[1]);
             }
-            args += 3;
             continue;
         default:
             break;
@@ -784,7 +771,7 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
            output argument is supported. */
         mask = -1;
         affected = -1;
-        switch (op) {
+        switch (opc) {
         CASE_OP_32_64(ext8s):
             if ((temps[args[1]].mask & 0x80) != 0) {
                 break;
@@ -923,38 +910,31 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
 
         if (partmask == 0) {
             assert(nb_oargs == 1);
-            tcg_opt_gen_movi(s, op_index, gen_args, op, args[0], 0);
-            args += nb_args;
-            gen_args += 2;
+            tcg_opt_gen_movi(s, op, args, opc, args[0], 0);
             continue;
         }
         if (affected == 0) {
             assert(nb_oargs == 1);
             if (temps_are_copies(args[0], args[1])) {
-                s->gen_opc_buf[op_index] = INDEX_op_nop;
+                op->opc = INDEX_op_nop;
             } else if (temps[args[1]].state != TCG_TEMP_CONST) {
-                tcg_opt_gen_mov(s, op_index, gen_args, op, args[0], args[1]);
-                gen_args += 2;
+                tcg_opt_gen_mov(s, op, args, opc, args[0], args[1]);
             } else {
-                tcg_opt_gen_movi(s, op_index, gen_args, op,
+                tcg_opt_gen_movi(s, op, args, opc,
                                  args[0], temps[args[1]].val);
-                gen_args += 2;
             }
-            args += nb_args;
             continue;
         }
 
         /* Simplify expression for "op r, a, 0 => movi r, 0" cases */
-        switch (op) {
+        switch (opc) {
         CASE_OP_32_64(and):
         CASE_OP_32_64(mul):
         CASE_OP_32_64(muluh):
         CASE_OP_32_64(mulsh):
             if ((temps[args[2]].state == TCG_TEMP_CONST
                 && temps[args[2]].val == 0)) {
-                tcg_opt_gen_movi(s, op_index, gen_args, op, args[0], 0);
-                args += 3;
-                gen_args += 2;
+                tcg_opt_gen_movi(s, op, args, opc, args[0], 0);
                 continue;
             }
             break;
@@ -963,18 +943,15 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
         }
 
         /* Simplify expression for "op r, a, a => mov r, a" cases */
-        switch (op) {
+        switch (opc) {
         CASE_OP_32_64(or):
         CASE_OP_32_64(and):
             if (temps_are_copies(args[1], args[2])) {
                 if (temps_are_copies(args[0], args[1])) {
-                    s->gen_opc_buf[op_index] = INDEX_op_nop;
+                    op->opc = INDEX_op_nop;
                 } else {
-                    tcg_opt_gen_mov(s, op_index, gen_args, op,
-                                    args[0], args[1]);
-                    gen_args += 2;
+                    tcg_opt_gen_mov(s, op, args, opc, args[0], args[1]);
                 }
-                args += 3;
                 continue;
             }
             break;
@@ -983,14 +960,12 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
         }
 
         /* Simplify expression for "op r, a, a => movi r, 0" cases */
-        switch (op) {
+        switch (opc) {
         CASE_OP_32_64(andc):
         CASE_OP_32_64(sub):
         CASE_OP_32_64(xor):
             if (temps_are_copies(args[1], args[2])) {
-                tcg_opt_gen_movi(s, op_index, gen_args, op, args[0], 0);
-                gen_args += 2;
-                args += 3;
+                tcg_opt_gen_movi(s, op, args, opc, args[0], 0);
                 continue;
             }
             break;
@@ -1001,17 +976,14 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
         /* Propagate constants through copy operations and do constant
            folding.  Constants will be substituted to arguments by register
            allocator where needed and possible.  Also detect copies. */
-        switch (op) {
+        switch (opc) {
         CASE_OP_32_64(mov):
             if (temps_are_copies(args[0], args[1])) {
-                args += 2;
-                s->gen_opc_buf[op_index] = INDEX_op_nop;
+                op->opc = INDEX_op_nop;
                 break;
             }
             if (temps[args[1]].state != TCG_TEMP_CONST) {
-                tcg_opt_gen_mov(s, op_index, gen_args, op, args[0], args[1]);
-                gen_args += 2;
-                args += 2;
+                tcg_opt_gen_mov(s, op, args, opc, args[0], args[1]);
                 break;
             }
             /* Source argument is constant.  Rewrite the operation and
@@ -1019,9 +991,7 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
             args[1] = temps[args[1]].val;
             /* fallthrough */
         CASE_OP_32_64(movi):
-            tcg_opt_gen_movi(s, op_index, gen_args, op, args[0], args[1]);
-            gen_args += 2;
-            args += 2;
+            tcg_opt_gen_movi(s, op, args, opc, args[0], args[1]);
             break;
 
         CASE_OP_32_64(not):
@@ -1033,20 +1003,16 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
         case INDEX_op_ext32s_i64:
         case INDEX_op_ext32u_i64:
             if (temps[args[1]].state == TCG_TEMP_CONST) {
-                tmp = do_constant_folding(op, temps[args[1]].val, 0);
-                tcg_opt_gen_movi(s, op_index, gen_args, op, args[0], tmp);
-                gen_args += 2;
-                args += 2;
+                tmp = do_constant_folding(opc, temps[args[1]].val, 0);
+                tcg_opt_gen_movi(s, op, args, opc, args[0], tmp);
                 break;
             }
             goto do_default;
 
         case INDEX_op_trunc_shr_i32:
             if (temps[args[1]].state == TCG_TEMP_CONST) {
-                tmp = do_constant_folding(op, temps[args[1]].val, args[2]);
-                tcg_opt_gen_movi(s, op_index, gen_args, op, args[0], tmp);
-                gen_args += 2;
-                args += 3;
+                tmp = do_constant_folding(opc, temps[args[1]].val, args[2]);
+                tcg_opt_gen_movi(s, op, args, opc, args[0], tmp);
                 break;
             }
             goto do_default;
@@ -1075,11 +1041,9 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
         CASE_OP_32_64(remu):
             if (temps[args[1]].state == TCG_TEMP_CONST
                 && temps[args[2]].state == TCG_TEMP_CONST) {
-                tmp = do_constant_folding(op, temps[args[1]].val,
+                tmp = do_constant_folding(opc, temps[args[1]].val,
                                           temps[args[2]].val);
-                tcg_opt_gen_movi(s, op_index, gen_args, op, args[0], tmp);
-                gen_args += 2;
-                args += 3;
+                tcg_opt_gen_movi(s, op, args, opc, args[0], tmp);
                 break;
             }
             goto do_default;
@@ -1089,54 +1053,44 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
                 && temps[args[2]].state == TCG_TEMP_CONST) {
                 tmp = deposit64(temps[args[1]].val, args[3], args[4],
                                 temps[args[2]].val);
-                tcg_opt_gen_movi(s, op_index, gen_args, op, args[0], tmp);
-                gen_args += 2;
-                args += 5;
+                tcg_opt_gen_movi(s, op, args, opc, args[0], tmp);
                 break;
             }
             goto do_default;
 
         CASE_OP_32_64(setcond):
-            tmp = do_constant_folding_cond(op, args[1], args[2], args[3]);
+            tmp = do_constant_folding_cond(opc, args[1], args[2], args[3]);
             if (tmp != 2) {
-                tcg_opt_gen_movi(s, op_index, gen_args, op, args[0], tmp);
-                gen_args += 2;
-                args += 4;
+                tcg_opt_gen_movi(s, op, args, opc, args[0], tmp);
                 break;
             }
             goto do_default;
 
         CASE_OP_32_64(brcond):
-            tmp = do_constant_folding_cond(op, args[0], args[1], args[2]);
+            tmp = do_constant_folding_cond(opc, args[0], args[1], args[2]);
             if (tmp != 2) {
                 if (tmp) {
                     reset_all_temps(nb_temps);
-                    s->gen_opc_buf[op_index] = INDEX_op_br;
-                    gen_args[0] = args[3];
-                    gen_args += 1;
+                    op->opc = INDEX_op_br;
+                    args[0] = args[3];
                 } else {
-                    s->gen_opc_buf[op_index] = INDEX_op_nop;
+                    op->opc = INDEX_op_nop;
                 }
-                args += 4;
                 break;
             }
             goto do_default;
 
         CASE_OP_32_64(movcond):
-            tmp = do_constant_folding_cond(op, args[1], args[2], args[5]);
+            tmp = do_constant_folding_cond(opc, args[1], args[2], args[5]);
             if (tmp != 2) {
                 if (temps_are_copies(args[0], args[4-tmp])) {
-                    s->gen_opc_buf[op_index] = INDEX_op_nop;
+                    op->opc = INDEX_op_nop;
                 } else if (temps[args[4-tmp]].state == TCG_TEMP_CONST) {
-                    tcg_opt_gen_movi(s, op_index, gen_args, op,
+                    tcg_opt_gen_movi(s, op, args, opc,
                                      args[0], temps[args[4-tmp]].val);
-                    gen_args += 2;
                 } else {
-                    tcg_opt_gen_mov(s, op_index, gen_args, op,
-                                    args[0], args[4-tmp]);
-                    gen_args += 2;
+                    tcg_opt_gen_mov(s, op, args, opc, args[0], args[4-tmp]);
                 }
-                args += 6;
                 break;
             }
             goto do_default;
@@ -1154,24 +1108,31 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
                 uint64_t a = ((uint64_t)ah << 32) | al;
                 uint64_t b = ((uint64_t)bh << 32) | bl;
                 TCGArg rl, rh;
+                TCGOp *op2;
+                TCGArg *args2;
 
-                if (op == INDEX_op_add2_i32) {
+                if (opc == INDEX_op_add2_i32) {
                     a += b;
                 } else {
                     a -= b;
                 }
 
                 /* We emit the extra nop when we emit the add2/sub2.  */
-                assert(s->gen_opc_buf[op_index + 1] == INDEX_op_nop);
+                op2 = &s->gen_op_buf[oi_next];
+                assert(op2->opc == INDEX_op_nop);
+
+                /* But we still have to allocate args for the op.  */
+                op2->args = s->gen_next_parm_idx;
+                s->gen_next_parm_idx += 2;
+                args2 = &s->gen_opparam_buf[op2->args];
 
                 rl = args[0];
                 rh = args[1];
-                tcg_opt_gen_movi(s, op_index, &gen_args[0],
-                                 op, rl, (uint32_t)a);
-                tcg_opt_gen_movi(s, ++op_index, &gen_args[2],
-                                 op, rh, (uint32_t)(a >> 32));
-                gen_args += 4;
-                args += 6;
+                tcg_opt_gen_movi(s, op, args, opc, rl, (uint32_t)a);
+                tcg_opt_gen_movi(s, op2, args2, opc, rh, (uint32_t)(a >> 32));
+
+                /* We've done all we need to do with the movi.  Skip it.  */
+                oi_next = op2->next;
                 break;
             }
             goto do_default;
@@ -1183,18 +1144,25 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
                 uint32_t b = temps[args[3]].val;
                 uint64_t r = (uint64_t)a * b;
                 TCGArg rl, rh;
+                TCGOp *op2;
+                TCGArg *args2;
 
                 /* We emit the extra nop when we emit the mulu2.  */
-                assert(s->gen_opc_buf[op_index + 1] == INDEX_op_nop);
+                op2 = &s->gen_op_buf[oi_next];
+                assert(op2->opc == INDEX_op_nop);
+
+                /* But we still have to allocate args for the op.  */
+                op2->args = s->gen_next_parm_idx;
+                s->gen_next_parm_idx += 2;
+                args2 = &s->gen_opparam_buf[op2->args];
 
                 rl = args[0];
                 rh = args[1];
-                tcg_opt_gen_movi(s, op_index, &gen_args[0],
-                                 op, rl, (uint32_t)r);
-                tcg_opt_gen_movi(s, ++op_index, &gen_args[2],
-                                 op, rh, (uint32_t)(r >> 32));
-                gen_args += 4;
-                args += 4;
+                tcg_opt_gen_movi(s, op, args, opc, rl, (uint32_t)r);
+                tcg_opt_gen_movi(s, op2, args2, opc, rh, (uint32_t)(r >> 32));
+
+                /* We've done all we need to do with the movi.  Skip it.  */
+                oi_next = op2->next;
                 break;
             }
             goto do_default;
@@ -1205,12 +1173,11 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
                 if (tmp) {
             do_brcond_true:
                     reset_all_temps(nb_temps);
-                    s->gen_opc_buf[op_index] = INDEX_op_br;
-                    gen_args[0] = args[5];
-                    gen_args += 1;
+                    op->opc = INDEX_op_br;
+                    args[0] = args[5];
                 } else {
             do_brcond_false:
-                    s->gen_opc_buf[op_index] = INDEX_op_nop;
+                    op->opc = INDEX_op_nop;
                 }
             } else if ((args[4] == TCG_COND_LT || args[4] == TCG_COND_GE)
                        && temps[args[2]].state == TCG_TEMP_CONST
@@ -1221,12 +1188,11 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
                    vs the high word of the input.  */
             do_brcond_high:
                 reset_all_temps(nb_temps);
-                s->gen_opc_buf[op_index] = INDEX_op_brcond_i32;
-                gen_args[0] = args[1];
-                gen_args[1] = args[3];
-                gen_args[2] = args[4];
-                gen_args[3] = args[5];
-                gen_args += 4;
+                op->opc = INDEX_op_brcond_i32;
+                args[0] = args[1];
+                args[1] = args[3];
+                args[2] = args[4];
+                args[3] = args[5];
             } else if (args[4] == TCG_COND_EQ) {
                 /* Simplify EQ comparisons where one of the pairs
                    can be simplified.  */
@@ -1246,12 +1212,10 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
                 }
             do_brcond_low:
                 reset_all_temps(nb_temps);
-                s->gen_opc_buf[op_index] = INDEX_op_brcond_i32;
-                gen_args[0] = args[0];
-                gen_args[1] = args[2];
-                gen_args[2] = args[4];
-                gen_args[3] = args[5];
-                gen_args += 4;
+                op->opc = INDEX_op_brcond_i32;
+                args[1] = args[2];
+                args[2] = args[4];
+                args[3] = args[5];
             } else if (args[4] == TCG_COND_NE) {
                 /* Simplify NE comparisons where one of the pairs
                    can be simplified.  */
@@ -1273,15 +1237,13 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
             } else {
                 goto do_default;
             }
-            args += 6;
             break;
 
         case INDEX_op_setcond2_i32:
             tmp = do_constant_folding_cond2(&args[1], &args[3], args[5]);
             if (tmp != 2) {
             do_setcond_const:
-                tcg_opt_gen_movi(s, op_index, gen_args, op, args[0], tmp);
-                gen_args += 2;
+                tcg_opt_gen_movi(s, op, args, opc, args[0], tmp);
             } else if ((args[5] == TCG_COND_LT || args[5] == TCG_COND_GE)
                        && temps[args[3]].state == TCG_TEMP_CONST
                        && temps[args[4]].state == TCG_TEMP_CONST
@@ -1290,14 +1252,12 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
                 /* Simplify LT/GE comparisons vs zero to a single compare
                    vs the high word of the input.  */
             do_setcond_high:
-                s->gen_opc_buf[op_index] = INDEX_op_setcond_i32;
                 reset_temp(args[0]);
                 temps[args[0]].mask = 1;
-                gen_args[0] = args[0];
-                gen_args[1] = args[2];
-                gen_args[2] = args[4];
-                gen_args[3] = args[5];
-                gen_args += 4;
+                op->opc = INDEX_op_setcond_i32;
+                args[1] = args[2];
+                args[2] = args[4];
+                args[3] = args[5];
             } else if (args[5] == TCG_COND_EQ) {
                 /* Simplify EQ comparisons where one of the pairs
                    can be simplified.  */
@@ -1318,12 +1278,9 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
             do_setcond_low:
                 reset_temp(args[0]);
                 temps[args[0]].mask = 1;
-                s->gen_opc_buf[op_index] = INDEX_op_setcond_i32;
-                gen_args[0] = args[0];
-                gen_args[1] = args[1];
-                gen_args[2] = args[3];
-                gen_args[3] = args[5];
-                gen_args += 4;
+                op->opc = INDEX_op_setcond_i32;
+                args[2] = args[3];
+                args[3] = args[5];
             } else if (args[5] == TCG_COND_NE) {
                 /* Simplify NE comparisons where one of the pairs
                    can be simplified.  */
@@ -1345,7 +1302,6 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
             } else {
                 goto do_default;
             }
-            args += 6;
             break;
 
         case INDEX_op_call:
@@ -1377,22 +1333,12 @@ static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
                     }
                 }
             }
-            for (i = 0; i < nb_args; i++) {
-                gen_args[i] = args[i];
-            }
-            args += nb_args;
-            gen_args += nb_args;
             break;
         }
     }
-
-    return gen_args;
 }
 
-TCGArg *tcg_optimize(TCGContext *s, uint16_t *tcg_opc_ptr,
-        TCGArg *args, TCGOpDef *tcg_op_defs)
+void tcg_optimize(TCGContext *s)
 {
-    TCGArg *res;
-    res = tcg_constant_folding(s, tcg_opc_ptr, args, tcg_op_defs);
-    return res;
+    tcg_constant_folding(s);
 }
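
A reader's note on the optimizer change above: since each op now carries an index to its own parameters, a fold can rewrite the op where it sits instead of copying into a second gen_args stream. The toy below is only a sketch of the add2 case under simplifying assumptions. ToyOp, ToyCtx, toy_movi and toy_fold_add2 are hypothetical names, not QEMU's, and the inputs are stored as literal constants rather than looked up through temps[].

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t Arg;                   /* stand-in for TCGArg */

enum { TOY_NOP, TOY_MOVI, TOY_ADD2 };   /* hypothetical opcode numbers */

typedef struct ToyOp {
    int opc;
    int args;                           /* index into parm[], -1 if unset */
    int next;                           /* index of next op, -1 at the end */
} ToyOp;

typedef struct ToyCtx {
    ToyOp ops[8];
    Arg parm[32];
    int first_op;
    int next_parm;                      /* first unused slot in parm[] */
} ToyCtx;

/* Rewrite an op into "movi dst, val", reusing its own parameter slots. */
static void toy_movi(ToyOp *op, Arg *args, Arg dst, Arg val)
{
    op->opc = TOY_MOVI;
    args[0] = dst;
    args[1] = val;
}

/* Fold "add2 rl, rh, al, ah, bl, bh" plus its trailing nop into two movi. */
static void toy_fold_add2(ToyCtx *s)
{
    int oi, oi_next;

    for (oi = s->first_op; oi >= 0; oi = oi_next) {
        ToyOp *op = &s->ops[oi];
        ToyOp *op2;
        Arg *args, *args2, rl, rh;
        uint64_t a, b;

        oi_next = op->next;
        if (op->opc != TOY_ADD2) {
            continue;
        }
        args = &s->parm[op->args];
        a = (args[3] << 32) | (uint32_t)args[2];
        b = (args[5] << 32) | (uint32_t)args[4];
        a += b;

        /* The emitter is assumed to have left a nop right after the add2. */
        assert(oi_next >= 0);
        op2 = &s->ops[oi_next];
        assert(op2->opc == TOY_NOP);

        /* The nop has no parameters yet, so claim two fresh slots for it. */
        assert(s->next_parm + 2 <= 32);
        op2->args = s->next_parm;
        s->next_parm += 2;
        args2 = &s->parm[op2->args];

        /* Save both outputs first: the first movi overwrites args[1]. */
        rl = args[0];
        rh = args[1];
        toy_movi(op, args, rl, (uint32_t)a);
        toy_movi(op2, args2, rh, (uint32_t)(a >> 32));

        /* The second movi is finished, so do not visit it again. */
        oi_next = op2->next;
    }
}
```

The parameter slots that add2 no longer needs are simply left unused, which is the trade this layout makes for not having to compact the stream.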
diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index 5305f1d..fbd82bd 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -35,100 +35,116 @@ extern TCGv_i32 TCGV_HIGH_link_error(TCGv_i64);
 #define TCGV_HIGH TCGV_HIGH_link_error
 #endif
 
+/* Note that this is optimized for sequential allocation during translation,
+   up to and including filling in the forward link immediately.  We'll
+   properly terminate the end of the list after translation finishes.  */
+
+static void tcg_gen_op_begin(TCGContext *ctx, TCGOpcode opc, int args)
+{
+    int oi = ctx->gen_next_op_idx;
+    int ni = oi + 1;
+    int pi = oi - 1;
+
+    tcg_debug_assert(oi < OPC_BUF_SIZE);
+    ctx->gen_last_op_idx = oi;
+    ctx->gen_next_op_idx = ni;
+
+    ctx->gen_op_buf[oi] = (TCGOp){
+        .opc = opc,
+        .args = args,
+        .prev = pi,
+        .next = ni
+    };
+}
+
 void tcg_gen_op0(TCGContext *ctx, TCGOpcode opc)
 {
-    *ctx->gen_opc_ptr++ = opc;
+    tcg_gen_op_begin(ctx, opc, -1);
 }
 
 void tcg_gen_op1(TCGContext *ctx, TCGOpcode opc, TCGArg a1)
 {
-    uint16_t *op = ctx->gen_opc_ptr;
-    TCGArg *opp = ctx->gen_opparam_ptr;
+    int pi = ctx->gen_next_parm_idx;
 
-    op[0] = opc;
-    opp[0] = a1;
+    tcg_debug_assert(pi + 1 <= OPPARAM_BUF_SIZE);
+    ctx->gen_next_parm_idx = pi + 1;
+    ctx->gen_opparam_buf[pi] = a1;
 
-    ctx->gen_opc_ptr = op + 1;
-    ctx->gen_opparam_ptr = opp + 1;
+    tcg_gen_op_begin(ctx, opc, pi);
 }
 
 void tcg_gen_op2(TCGContext *ctx, TCGOpcode opc, TCGArg a1, TCGArg a2)
 {
-    uint16_t *op = ctx->gen_opc_ptr;
-    TCGArg *opp = ctx->gen_opparam_ptr;
+    int pi = ctx->gen_next_parm_idx;
 
-    op[0] = opc;
-    opp[0] = a1;
-    opp[1] = a2;
+    tcg_debug_assert(pi + 2 <= OPPARAM_BUF_SIZE);
+    ctx->gen_next_parm_idx = pi + 2;
+    ctx->gen_opparam_buf[pi + 0] = a1;
+    ctx->gen_opparam_buf[pi + 1] = a2;
 
-    ctx->gen_opc_ptr = op + 1;
-    ctx->gen_opparam_ptr = opp + 2;
+    tcg_gen_op_begin(ctx, opc, pi);
 }
 
 void tcg_gen_op3(TCGContext *ctx, TCGOpcode opc, TCGArg a1,
                  TCGArg a2, TCGArg a3)
 {
-    uint16_t *op = ctx->gen_opc_ptr;
-    TCGArg *opp = ctx->gen_opparam_ptr;
+    int pi = ctx->gen_next_parm_idx;
 
-    op[0] = opc;
-    opp[0] = a1;
-    opp[1] = a2;
-    opp[2] = a3;
+    tcg_debug_assert(pi + 3 <= OPPARAM_BUF_SIZE);
+    ctx->gen_next_parm_idx = pi + 3;
+    ctx->gen_opparam_buf[pi + 0] = a1;
+    ctx->gen_opparam_buf[pi + 1] = a2;
+    ctx->gen_opparam_buf[pi + 2] = a3;
 
-    ctx->gen_opc_ptr = op + 1;
-    ctx->gen_opparam_ptr = opp + 3;
+    tcg_gen_op_begin(ctx, opc, pi);
 }
 
 void tcg_gen_op4(TCGContext *ctx, TCGOpcode opc, TCGArg a1,
                  TCGArg a2, TCGArg a3, TCGArg a4)
 {
-    uint16_t *op = ctx->gen_opc_ptr;
-    TCGArg *opp = ctx->gen_opparam_ptr;
+    int pi = ctx->gen_next_parm_idx;
 
-    op[0] = opc;
-    opp[0] = a1;
-    opp[1] = a2;
-    opp[2] = a3;
-    opp[3] = a4;
+    tcg_debug_assert(pi + 4 <= OPPARAM_BUF_SIZE);
+    ctx->gen_next_parm_idx = pi + 4;
+    ctx->gen_opparam_buf[pi + 0] = a1;
+    ctx->gen_opparam_buf[pi + 1] = a2;
+    ctx->gen_opparam_buf[pi + 2] = a3;
+    ctx->gen_opparam_buf[pi + 3] = a4;
 
-    ctx->gen_opc_ptr = op + 1;
-    ctx->gen_opparam_ptr = opp + 4;
+    tcg_gen_op_begin(ctx, opc, pi);
 }
 
 void tcg_gen_op5(TCGContext *ctx, TCGOpcode opc, TCGArg a1,
                  TCGArg a2, TCGArg a3, TCGArg a4, TCGArg a5)
 {
-    uint16_t *op = ctx->gen_opc_ptr;
-    TCGArg *opp = ctx->gen_opparam_ptr;
+    int pi = ctx->gen_next_parm_idx;
 
-    op[0] = opc;
-    opp[0] = a1;
-    opp[1] = a2;
-    opp[2] = a3;
-    opp[3] = a4;
-    opp[4] = a5;
+    tcg_debug_assert(pi + 5 <= OPPARAM_BUF_SIZE);
+    ctx->gen_next_parm_idx = pi + 5;
+    ctx->gen_opparam_buf[pi + 0] = a1;
+    ctx->gen_opparam_buf[pi + 1] = a2;
+    ctx->gen_opparam_buf[pi + 2] = a3;
+    ctx->gen_opparam_buf[pi + 3] = a4;
+    ctx->gen_opparam_buf[pi + 4] = a5;
 
-    ctx->gen_opc_ptr = op + 1;
-    ctx->gen_opparam_ptr = opp + 5;
+    tcg_gen_op_begin(ctx, opc, pi);
 }
 
 void tcg_gen_op6(TCGContext *ctx, TCGOpcode opc, TCGArg a1, TCGArg a2,
                  TCGArg a3, TCGArg a4, TCGArg a5, TCGArg a6)
 {
-    uint16_t *op = ctx->gen_opc_ptr;
-    TCGArg *opp = ctx->gen_opparam_ptr;
+    int pi = ctx->gen_next_parm_idx;
 
-    op[0] = opc;
-    opp[0] = a1;
-    opp[1] = a2;
-    opp[2] = a3;
-    opp[3] = a4;
-    opp[4] = a5;
-    opp[5] = a6;
+    tcg_debug_assert(pi + 6 <= OPPARAM_BUF_SIZE);
+    ctx->gen_next_parm_idx = pi + 6;
+    ctx->gen_opparam_buf[pi + 0] = a1;
+    ctx->gen_opparam_buf[pi + 1] = a2;
+    ctx->gen_opparam_buf[pi + 2] = a3;
+    ctx->gen_opparam_buf[pi + 3] = a4;
+    ctx->gen_opparam_buf[pi + 4] = a5;
+    ctx->gen_opparam_buf[pi + 5] = a6;
 
-    ctx->gen_opc_ptr = op + 1;
-    ctx->gen_opparam_ptr = opp + 6;
+    tcg_gen_op_begin(ctx, opc, pi);
 }
 
 /* 32 bit ops */
@@ -1862,47 +1878,53 @@ static inline TCGMemOp tcg_canonicalize_memop(TCGMemOp op, bool is64, bool st)
     return op;
 }
 
-static inline void tcg_add_param_i32(TCGv_i32 val)
-{
-    *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(val);
-}
-
-static inline void tcg_add_param_i64(TCGv_i64 val)
+static void gen_ldst_i32(TCGOpcode opc, TCGv_i32 val, TCGv addr,
+                         TCGMemOp memop, TCGArg idx)
 {
+#if TARGET_LONG_BITS == 32
+    tcg_gen_op4ii_i32(opc, val, addr, memop, idx);
+#else
     if (TCG_TARGET_REG_BITS == 32) {
-        *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(TCGV_LOW(val));
-        *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I32(TCGV_HIGH(val));
+        tcg_gen_op5ii_i32(opc, val, TCGV_LOW(addr), TCGV_HIGH(addr),
+                          memop, idx);
     } else {
-        *tcg_ctx.gen_opparam_ptr++ = GET_TCGV_I64(val);
+        tcg_gen_op4(&tcg_ctx, opc, GET_TCGV_I32(val), GET_TCGV_I64(addr),
+                    memop, idx);
     }
+#endif
 }
 
+static void gen_ldst_i64(TCGOpcode opc, TCGv_i64 val, TCGv addr,
+                         TCGMemOp memop, TCGArg idx)
+{
 #if TARGET_LONG_BITS == 32
-# define tcg_add_param_tl  tcg_add_param_i32
+    if (TCG_TARGET_REG_BITS == 32) {
+        tcg_gen_op5ii_i32(opc, TCGV_LOW(val), TCGV_HIGH(val),
+                          addr, memop, idx);
+    } else {
+        tcg_gen_op4(&tcg_ctx, opc, GET_TCGV_I64(val), GET_TCGV_I32(addr),
+                    memop, idx);
+    }
 #else
-# define tcg_add_param_tl  tcg_add_param_i64
+    if (TCG_TARGET_REG_BITS == 32) {
+        tcg_gen_op6ii_i32(opc, TCGV_LOW(val), TCGV_HIGH(val),
+                          TCGV_LOW(addr), TCGV_HIGH(addr), memop, idx);
+    } else {
+        tcg_gen_op4ii_i64(opc, val, addr, memop, idx);
+    }
 #endif
+}
 
 void tcg_gen_qemu_ld_i32(TCGv_i32 val, TCGv addr, TCGArg idx, TCGMemOp memop)
 {
     memop = tcg_canonicalize_memop(memop, 0, 0);
-
-    *tcg_ctx.gen_opc_ptr++ = INDEX_op_qemu_ld_i32;
-    tcg_add_param_i32(val);
-    tcg_add_param_tl(addr);
-    *tcg_ctx.gen_opparam_ptr++ = memop;
-    *tcg_ctx.gen_opparam_ptr++ = idx;
+    gen_ldst_i32(INDEX_op_qemu_ld_i32, val, addr, memop, idx);
 }
 
 void tcg_gen_qemu_st_i32(TCGv_i32 val, TCGv addr, TCGArg idx, TCGMemOp memop)
 {
     memop = tcg_canonicalize_memop(memop, 0, 1);
-
-    *tcg_ctx.gen_opc_ptr++ = INDEX_op_qemu_st_i32;
-    tcg_add_param_i32(val);
-    tcg_add_param_tl(addr);
-    *tcg_ctx.gen_opparam_ptr++ = memop;
-    *tcg_ctx.gen_opparam_ptr++ = idx;
+    gen_ldst_i32(INDEX_op_qemu_st_i32, val, addr, memop, idx);
 }
 
 void tcg_gen_qemu_ld_i64(TCGv_i64 val, TCGv addr, TCGArg idx, TCGMemOp memop)
@@ -1910,7 +1932,7 @@ void tcg_gen_qemu_ld_i64(TCGv_i64 val, TCGv addr, TCGArg idx, TCGMemOp memop)
     memop = tcg_canonicalize_memop(memop, 1, 0);
 
     if (TCG_TARGET_REG_BITS == 32 && (memop & MO_SIZE) < MO_64) {
-        tcg_gen_qemu_ld_i32(TCGV_LOW(val), addr, idx, memop);
+        gen_ldst_i32(INDEX_op_qemu_ld_i32, TCGV_LOW(val), addr, memop, idx);
         if (memop & MO_SIGN) {
             tcg_gen_sari_i32(TCGV_HIGH(val), TCGV_LOW(val), 31);
         } else {
@@ -1919,11 +1941,7 @@ void tcg_gen_qemu_ld_i64(TCGv_i64 val, TCGv addr, TCGArg idx, TCGMemOp memop)
         return;
     }
 
-    *tcg_ctx.gen_opc_ptr++ = INDEX_op_qemu_ld_i64;
-    tcg_add_param_i64(val);
-    tcg_add_param_tl(addr);
-    *tcg_ctx.gen_opparam_ptr++ = memop;
-    *tcg_ctx.gen_opparam_ptr++ = idx;
+    gen_ldst_i64(INDEX_op_qemu_ld_i64, val, addr, memop, idx);
 }
 
 void tcg_gen_qemu_st_i64(TCGv_i64 val, TCGv addr, TCGArg idx, TCGMemOp memop)
@@ -1931,13 +1949,9 @@ void tcg_gen_qemu_st_i64(TCGv_i64 val, TCGv addr, TCGArg idx, TCGMemOp memop)
     memop = tcg_canonicalize_memop(memop, 1, 1);
 
     if (TCG_TARGET_REG_BITS == 32 && (memop & MO_SIZE) < MO_64) {
-        tcg_gen_qemu_st_i32(TCGV_LOW(val), addr, idx, memop);
+        gen_ldst_i32(INDEX_op_qemu_st_i32, TCGV_LOW(val), addr, memop, idx);
         return;
     }
 
-    *tcg_ctx.gen_opc_ptr++ = INDEX_op_qemu_st_i64;
-    tcg_add_param_i64(val);
-    tcg_add_param_tl(addr);
-    *tcg_ctx.gen_opparam_ptr++ = memop;
-    *tcg_ctx.gen_opparam_ptr++ = idx;
+    gen_ldst_i64(INDEX_op_qemu_st_i64, val, addr, memop, idx);
 }
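
A reader's note on tcg_gen_op_begin above: every new op is linked forward to the slot that the next allocation will use, so the last op's forward link dangles until translation ends. The following is a minimal sketch of that scheme, assuming a small fixed buffer. ListOp, ListCtx and the list_* helpers are hypothetical names for illustration, and list_finish only stands in for whatever step terminates the list after translation.

```c
#include <assert.h>

enum { LIST_BUF_SIZE = 8 };     /* stand-in for OPC_BUF_SIZE */

typedef struct ListOp {
    int opc;                    /* opcode number */
    int args;                   /* index of first parameter, -1 if none */
    int prev, next;             /* neighbouring op indices, -1 at the ends */
} ListOp;

typedef struct ListCtx {
    ListOp buf[LIST_BUF_SIZE];
    int first, last, next_free;
} ListCtx;

static void list_start(ListCtx *c)
{
    c->first = 0;
    c->last = -1;
    c->next_free = 0;
}

/* Append an op and link it to the slot the next append will take. */
static void list_append(ListCtx *c, int opc, int args)
{
    int oi = c->next_free;

    assert(oi < LIST_BUF_SIZE);
    c->last = oi;
    c->next_free = oi + 1;
    c->buf[oi] = (ListOp){
        .opc = opc,
        .args = args,
        .prev = oi - 1,         /* -1 for the very first op */
        .next = oi + 1          /* dangling until list_finish */
    };
}

/* After translation, cut the dangling forward link of the final op. */
static void list_finish(ListCtx *c)
{
    if (c->last >= 0) {
        c->buf[c->last].next = -1;
    }
}

/* Walk forward until a negative index, as the dump loop does. */
static int list_count(const ListCtx *c)
{
    int n = 0;

    if (c->last < 0) {
        return 0;               /* nothing was emitted */
    }
    for (int oi = c->first; oi >= 0; oi = c->buf[oi].next) {
        n++;
    }
    return n;
}
```

Using indices rather than pointers keeps each node small and lets -1 serve as the terminator in both directions.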
diff --git a/tcg/tcg.c b/tcg/tcg.c
index ae9811f..f5f29d7 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -407,7 +407,6 @@ void tcg_func_start(TCGContext *s)
     /* No temps have been previously allocated for size or locality.  */
     memset(s->free_temps, 0, sizeof(s->free_temps));
 
-    s->labels = tcg_malloc(sizeof(TCGLabel) * TCG_MAX_LABELS);
     s->nb_labels = 0;
     s->current_frame_offset = s->frame_start;
 
@@ -415,8 +414,10 @@ void tcg_func_start(TCGContext *s)
     s->goto_tb_issue_mask = 0;
 #endif
 
-    s->gen_opc_ptr = s->gen_opc_buf;
-    s->gen_opparam_ptr = s->gen_opparam_buf;
+    s->gen_first_op_idx = 0;
+    s->gen_last_op_idx = -1;
+    s->gen_next_op_idx = 0;
+    s->gen_next_parm_idx = 0;
 
     s->be = tcg_malloc(sizeof(TCGBackendData));
 }
@@ -703,9 +704,8 @@ int tcg_check_temp_count(void)
 void tcg_gen_callN(TCGContext *s, void *func, TCGArg ret,
                    int nargs, TCGArg *args)
 {
-    int i, real_args, nb_rets;
+    int i, real_args, nb_rets, pi, pi_first;
     unsigned sizemask, flags;
-    TCGArg *nparam;
     TCGHelperInfo *info;
 
     info = g_hash_table_lookup(s->helpers, (gpointer)func);
@@ -758,8 +758,7 @@ void tcg_gen_callN(TCGContext *s, void *func, TCGArg ret,
     }
 #endif /* TCG_TARGET_EXTEND_ARGS */
 
-    *s->gen_opc_ptr++ = INDEX_op_call;
-    nparam = s->gen_opparam_ptr++;
+    pi_first = pi = s->gen_next_parm_idx;
     if (ret != TCG_CALL_DUMMY_ARG) {
 #if defined(__sparc__) && !defined(__arch64__) \
     && !defined(CONFIG_TCG_INTERPRETER)
@@ -769,25 +768,25 @@ void tcg_gen_callN(TCGContext *s, void *func, TCGArg ret,
                two return temporaries, and reassemble below.  */
             retl = tcg_temp_new_i64();
             reth = tcg_temp_new_i64();
-            *s->gen_opparam_ptr++ = GET_TCGV_I64(reth);
-            *s->gen_opparam_ptr++ = GET_TCGV_I64(retl);
+            s->gen_opparam_buf[pi++] = GET_TCGV_I64(reth);
+            s->gen_opparam_buf[pi++] = GET_TCGV_I64(retl);
             nb_rets = 2;
         } else {
-            *s->gen_opparam_ptr++ = ret;
+            s->gen_opparam_buf[pi++] = ret;
             nb_rets = 1;
         }
 #else
         if (TCG_TARGET_REG_BITS < 64 && (sizemask & 1)) {
 #ifdef HOST_WORDS_BIGENDIAN
-            *s->gen_opparam_ptr++ = ret + 1;
-            *s->gen_opparam_ptr++ = ret;
+            s->gen_opparam_buf[pi++] = ret + 1;
+            s->gen_opparam_buf[pi++] = ret;
 #else
-            *s->gen_opparam_ptr++ = ret;
-            *s->gen_opparam_ptr++ = ret + 1;
+            s->gen_opparam_buf[pi++] = ret;
+            s->gen_opparam_buf[pi++] = ret + 1;
 #endif
             nb_rets = 2;
         } else {
-            *s->gen_opparam_ptr++ = ret;
+            s->gen_opparam_buf[pi++] = ret;
             nb_rets = 1;
         }
 #endif
@@ -801,7 +800,7 @@ void tcg_gen_callN(TCGContext *s, void *func, TCGArg ret,
 #ifdef TCG_TARGET_CALL_ALIGN_ARGS
             /* some targets want aligned 64 bit args */
             if (real_args & 1) {
-                *s->gen_opparam_ptr++ = TCG_CALL_DUMMY_ARG;
+                s->gen_opparam_buf[pi++] = TCG_CALL_DUMMY_ARG;
                 real_args++;
             }
 #endif
@@ -816,26 +815,42 @@ void tcg_gen_callN(TCGContext *s, void *func, TCGArg ret,
 	       have to get more complicated to differentiate between
 	       stack arguments and register arguments.  */
 #if defined(HOST_WORDS_BIGENDIAN) != defined(TCG_TARGET_STACK_GROWSUP)
-            *s->gen_opparam_ptr++ = args[i] + 1;
-            *s->gen_opparam_ptr++ = args[i];
+            s->gen_opparam_buf[pi++] = args[i] + 1;
+            s->gen_opparam_buf[pi++] = args[i];
 #else
-            *s->gen_opparam_ptr++ = args[i];
-            *s->gen_opparam_ptr++ = args[i] + 1;
+            s->gen_opparam_buf[pi++] = args[i];
+            s->gen_opparam_buf[pi++] = args[i] + 1;
 #endif
             real_args += 2;
             continue;
         }
 
-        *s->gen_opparam_ptr++ = args[i];
+        s->gen_opparam_buf[pi++] = args[i];
         real_args++;
     }
-    *s->gen_opparam_ptr++ = (uintptr_t)func;
-    *s->gen_opparam_ptr++ = flags;
+    s->gen_opparam_buf[pi++] = (uintptr_t)func;
+    s->gen_opparam_buf[pi++] = flags;
+
+    i = s->gen_next_op_idx;
+    tcg_debug_assert(i < OPC_BUF_SIZE);
+    tcg_debug_assert(pi <= OPPARAM_BUF_SIZE);
+
+    /* Set links for sequential allocation during translation.  */
+    s->gen_op_buf[i] = (TCGOp){
+        .opc = INDEX_op_call,
+        .callo = nb_rets,
+        .calli = real_args,
+        .args = pi_first,
+        .prev = i - 1,
+        .next = i + 1
+    };
 
-    *nparam = (nb_rets << 16) | real_args;
+    /* Make sure the calli field didn't overflow.  */
+    tcg_debug_assert(s->gen_op_buf[i].calli == real_args);
 
-    /* total parameters, needed to go backward in the instruction stream */
-    *s->gen_opparam_ptr++ = 1 + nb_rets + real_args + 3;
+    s->gen_last_op_idx = i;
+    s->gen_next_op_idx = i + 1;
+    s->gen_next_parm_idx = pi;
 
 #if defined(__sparc__) && !defined(__arch64__) \
     && !defined(CONFIG_TCG_INTERPRETER)
@@ -972,20 +987,21 @@ static const char * const ldst_name[] =
 
 void tcg_dump_ops(TCGContext *s)
 {
-    const uint16_t *opc_ptr;
-    const TCGArg *args;
-    TCGArg arg;
-    TCGOpcode c;
-    int i, k, nb_oargs, nb_iargs, nb_cargs, first_insn;
-    const TCGOpDef *def;
     char buf[128];
+    TCGOp *op;
+    int oi;
 
-    first_insn = 1;
-    opc_ptr = s->gen_opc_buf;
-    args = s->gen_opparam_buf;
-    while (opc_ptr < s->gen_opc_ptr) {
-        c = *opc_ptr++;
+    for (oi = s->gen_first_op_idx; oi >= 0; oi = op->next) {
+        int i, k, nb_oargs, nb_iargs, nb_cargs;
+        const TCGOpDef *def;
+        const TCGArg *args;
+        TCGOpcode c;
+
+        op = &s->gen_op_buf[oi];
+        c = op->opc;
         def = &tcg_op_defs[c];
+        args = &s->gen_opparam_buf[op->args];
+
         if (c == INDEX_op_debug_insn_start) {
             uint64_t pc;
 #if TARGET_LONG_BITS > TCG_TARGET_REG_BITS
@@ -993,21 +1009,14 @@ void tcg_dump_ops(TCGContext *s)
 #else
             pc = args[0];
 #endif
-            if (!first_insn) {
+            if (oi != s->gen_first_op_idx) {
                 qemu_log("\n");
             }
             qemu_log(" ---- 0x%" PRIx64, pc);
-            first_insn = 0;
-            nb_oargs = def->nb_oargs;
-            nb_iargs = def->nb_iargs;
-            nb_cargs = def->nb_cargs;
         } else if (c == INDEX_op_call) {
-            TCGArg arg;
-
             /* variable number of arguments */
-            arg = *args++;
-            nb_oargs = arg >> 16;
-            nb_iargs = arg & 0xffff;
+            nb_oargs = op->callo;
+            nb_iargs = op->calli;
             nb_cargs = def->nb_cargs;
 
             /* function name, flags, out args */
@@ -1028,26 +1037,20 @@ void tcg_dump_ops(TCGContext *s)
             }
         } else {
             qemu_log(" %s ", def->name);
-            if (c == INDEX_op_nopn) {
-                /* variable number of arguments */
-                nb_cargs = *args;
-                nb_oargs = 0;
-                nb_iargs = 0;
-            } else {
-                nb_oargs = def->nb_oargs;
-                nb_iargs = def->nb_iargs;
-                nb_cargs = def->nb_cargs;
-            }
-            
+
+            nb_oargs = def->nb_oargs;
+            nb_iargs = def->nb_iargs;
+            nb_cargs = def->nb_cargs;
+
             k = 0;
-            for(i = 0; i < nb_oargs; i++) {
+            for (i = 0; i < nb_oargs; i++) {
                 if (k != 0) {
                     qemu_log(",");
                 }
                 qemu_log("%s", tcg_get_arg_str_idx(s, buf, sizeof(buf),
                                                    args[k++]));
             }
-            for(i = 0; i < nb_iargs; i++) {
+            for (i = 0; i < nb_iargs; i++) {
                 if (k != 0) {
                     qemu_log(",");
                 }
@@ -1085,16 +1088,14 @@ void tcg_dump_ops(TCGContext *s)
                 i = 0;
                 break;
             }
-            for(; i < nb_cargs; i++) {
+            for (; i < nb_cargs; i++) {
                 if (k != 0) {
                     qemu_log(",");
                 }
-                arg = args[k++];
-                qemu_log("$0x%" TCG_PRIlx, arg);
+                qemu_log("$0x%" TCG_PRIlx, args[k++]);
             }
         }
         qemu_log("\n");
-        args += nb_iargs + nb_oargs + nb_cargs;
     }
 }
 
@@ -1244,20 +1245,6 @@ void tcg_add_target_add_op_defs(const TCGTargetOpDef *tdefs)
 }
 
 #ifdef USE_LIVENESS_ANALYSIS
-
-/* set a nop for an operation using 'nb_args' */
-static inline void tcg_set_nop(TCGContext *s, uint16_t *opc_ptr, 
-                               TCGArg *args, int nb_args)
-{
-    if (nb_args == 0) {
-        *opc_ptr = INDEX_op_nop;
-    } else {
-        *opc_ptr = INDEX_op_nopn;
-        args[0] = nb_args;
-        args[nb_args - 1] = nb_args;
-    }
-}
-
 /* liveness analysis: end of function: all temps are dead, and globals
    should be in memory. */
 static inline void tcg_la_func_end(TCGContext *s, uint8_t *dead_temps,
@@ -1287,19 +1274,10 @@ static inline void tcg_la_bb_end(TCGContext *s, uint8_t *dead_temps,
    temporaries are removed. */
 static void tcg_liveness_analysis(TCGContext *s)
 {
-    int i, op_index, nb_args, nb_iargs, nb_oargs, nb_ops;
-    TCGOpcode op, op_new, op_new2;
-    TCGArg *args, arg;
-    const TCGOpDef *def;
     uint8_t *dead_temps, *mem_temps;
-    uint16_t dead_args;
-    uint8_t sync_args;
-    bool have_op_new2;
-    
-    s->gen_opc_ptr++; /* skip end */
-
-    nb_ops = s->gen_opc_ptr - s->gen_opc_buf;
+    int oi, oi_prev, nb_ops;
 
+    nb_ops = s->gen_next_op_idx;
     s->op_dead_args = tcg_malloc(nb_ops * sizeof(uint16_t));
     s->op_sync_args = tcg_malloc(nb_ops * sizeof(uint8_t));
     
@@ -1307,25 +1285,31 @@ static void tcg_liveness_analysis(TCGContext *s)
     mem_temps = tcg_malloc(s->nb_temps);
     tcg_la_func_end(s, dead_temps, mem_temps);
 
-    args = s->gen_opparam_ptr;
-    op_index = nb_ops - 1;
-    while (op_index >= 0) {
-        op = s->gen_opc_buf[op_index];
-        def = &tcg_op_defs[op];
-        switch(op) {
+    for (oi = s->gen_last_op_idx; oi >= 0; oi = oi_prev) {
+        int i, nb_iargs, nb_oargs;
+        TCGOpcode opc_new, opc_new2;
+        bool have_opc_new2;
+        uint16_t dead_args;
+        uint8_t sync_args;
+        TCGArg arg;
+
+        TCGOp * const op = &s->gen_op_buf[oi];
+        TCGArg * const args = &s->gen_opparam_buf[op->args];
+        TCGOpcode opc = op->opc;
+        const TCGOpDef *def = &tcg_op_defs[opc];
+
+        oi_prev = op->prev;
+
+        switch (opc) {
         case INDEX_op_call:
             {
                 int call_flags;
 
-                nb_args = args[-1];
-                args -= nb_args;
-                arg = *args++;
-                nb_iargs = arg & 0xffff;
-                nb_oargs = arg >> 16;
+                nb_oargs = op->callo;
+                nb_iargs = op->calli;
                 call_flags = args[nb_oargs + nb_iargs + 1];
 
-                /* pure functions can be removed if their result is not
-                   used */
+                /* pure functions can be removed if their result is unused */
                 if (call_flags & TCG_CALL_NO_SIDE_EFFECTS) {
                     for (i = 0; i < nb_oargs; i++) {
                         arg = args[i];
@@ -1333,8 +1317,7 @@ static void tcg_liveness_analysis(TCGContext *s)
                             goto do_not_remove_call;
                         }
                     }
-                    tcg_set_nop(s, s->gen_opc_buf + op_index,
-                                args - 1, nb_args);
+                    goto do_remove;
                 } else {
                 do_not_remove_call:
 
@@ -1373,41 +1356,33 @@ static void tcg_liveness_analysis(TCGContext *s)
                             dead_temps[arg] = 0;
                         }
                     }
-                    s->op_dead_args[op_index] = dead_args;
-                    s->op_sync_args[op_index] = sync_args;
+                    s->op_dead_args[oi] = dead_args;
+                    s->op_sync_args[oi] = sync_args;
                 }
-                args--;
             }
             break;
         case INDEX_op_debug_insn_start:
-            args -= def->nb_args;
-            break;
-        case INDEX_op_nopn:
-            nb_args = args[-1];
-            args -= nb_args;
+        case INDEX_op_nop:
+        case INDEX_op_end:
             break;
         case INDEX_op_discard:
-            args--;
             /* mark the temporary as dead */
             dead_temps[args[0]] = 1;
             mem_temps[args[0]] = 0;
             break;
-        case INDEX_op_end:
-            break;
 
         case INDEX_op_add2_i32:
-            op_new = INDEX_op_add_i32;
+            opc_new = INDEX_op_add_i32;
             goto do_addsub2;
         case INDEX_op_sub2_i32:
-            op_new = INDEX_op_sub_i32;
+            opc_new = INDEX_op_sub_i32;
             goto do_addsub2;
         case INDEX_op_add2_i64:
-            op_new = INDEX_op_add_i64;
+            opc_new = INDEX_op_add_i64;
             goto do_addsub2;
         case INDEX_op_sub2_i64:
-            op_new = INDEX_op_sub_i64;
+            opc_new = INDEX_op_sub_i64;
         do_addsub2:
-            args -= 6;
             nb_iargs = 4;
             nb_oargs = 2;
             /* Test if the high part of the operation is dead, but not
@@ -1418,12 +1393,11 @@ static void tcg_liveness_analysis(TCGContext *s)
                 if (dead_temps[args[0]] && !mem_temps[args[0]]) {
                     goto do_remove;
                 }
-                /* Create the single operation plus nop.  */
-                s->gen_opc_buf[op_index] = op = op_new;
+                /* Replace the opcode and adjust the args in place,
+                   leaving 3 unused args at the end.  */
+                op->opc = opc = opc_new;
                 args[1] = args[2];
                 args[2] = args[4];
-                assert(s->gen_opc_buf[op_index + 1] == INDEX_op_nop);
-                tcg_set_nop(s, s->gen_opc_buf + op_index + 1, args + 3, 3);
                 /* Fall through and mark the single-word operation live.  */
                 nb_iargs = 2;
                 nb_oargs = 1;
@@ -1431,27 +1405,26 @@ static void tcg_liveness_analysis(TCGContext *s)
             goto do_not_remove;
 
         case INDEX_op_mulu2_i32:
-            op_new = INDEX_op_mul_i32;
-            op_new2 = INDEX_op_muluh_i32;
-            have_op_new2 = TCG_TARGET_HAS_muluh_i32;
+            opc_new = INDEX_op_mul_i32;
+            opc_new2 = INDEX_op_muluh_i32;
+            have_opc_new2 = TCG_TARGET_HAS_muluh_i32;
             goto do_mul2;
         case INDEX_op_muls2_i32:
-            op_new = INDEX_op_mul_i32;
-            op_new2 = INDEX_op_mulsh_i32;
-            have_op_new2 = TCG_TARGET_HAS_mulsh_i32;
+            opc_new = INDEX_op_mul_i32;
+            opc_new2 = INDEX_op_mulsh_i32;
+            have_opc_new2 = TCG_TARGET_HAS_mulsh_i32;
             goto do_mul2;
         case INDEX_op_mulu2_i64:
-            op_new = INDEX_op_mul_i64;
-            op_new2 = INDEX_op_muluh_i64;
-            have_op_new2 = TCG_TARGET_HAS_muluh_i64;
+            opc_new = INDEX_op_mul_i64;
+            opc_new2 = INDEX_op_muluh_i64;
+            have_opc_new2 = TCG_TARGET_HAS_muluh_i64;
             goto do_mul2;
         case INDEX_op_muls2_i64:
-            op_new = INDEX_op_mul_i64;
-            op_new2 = INDEX_op_mulsh_i64;
-            have_op_new2 = TCG_TARGET_HAS_mulsh_i64;
+            opc_new = INDEX_op_mul_i64;
+            opc_new2 = INDEX_op_mulsh_i64;
+            have_opc_new2 = TCG_TARGET_HAS_mulsh_i64;
             goto do_mul2;
         do_mul2:
-            args -= 4;
             nb_iargs = 2;
             nb_oargs = 2;
             if (dead_temps[args[1]] && !mem_temps[args[1]]) {
@@ -1460,28 +1433,25 @@ static void tcg_liveness_analysis(TCGContext *s)
                     goto do_remove;
                 }
                 /* The high part of the operation is dead; generate the low. */
-                s->gen_opc_buf[op_index] = op = op_new;
+                op->opc = opc = opc_new;
                 args[1] = args[2];
                 args[2] = args[3];
-            } else if (have_op_new2 && dead_temps[args[0]]
+            } else if (have_opc_new2 && dead_temps[args[0]]
                        && !mem_temps[args[0]]) {
-                /* The low part of the operation is dead; generate the high.  */
-                s->gen_opc_buf[op_index] = op = op_new2;
+                /* The low part of the operation is dead; generate the high. */
+                op->opc = opc = opc_new2;
                 args[0] = args[1];
                 args[1] = args[2];
                 args[2] = args[3];
             } else {
                 goto do_not_remove;
             }
-            assert(s->gen_opc_buf[op_index + 1] == INDEX_op_nop);
-            tcg_set_nop(s, s->gen_opc_buf + op_index + 1, args + 3, 1);
             /* Mark the single-word operation live.  */
             nb_oargs = 1;
             goto do_not_remove;
 
         default:
             /* XXX: optimize by hardcoding common cases (e.g. triadic ops) */
-            args -= def->nb_args;
             nb_iargs = def->nb_iargs;
             nb_oargs = def->nb_oargs;
 
@@ -1489,24 +1459,23 @@ static void tcg_liveness_analysis(TCGContext *s)
                its outputs are dead. We assume that nb_oargs == 0
                implies side effects */
             if (!(def->flags & TCG_OPF_SIDE_EFFECTS) && nb_oargs != 0) {
-                for(i = 0; i < nb_oargs; i++) {
+                for (i = 0; i < nb_oargs; i++) {
                     arg = args[i];
                     if (!dead_temps[arg] || mem_temps[arg]) {
                         goto do_not_remove;
                     }
                 }
             do_remove:
-                tcg_set_nop(s, s->gen_opc_buf + op_index, args, def->nb_args);
+                op->opc = INDEX_op_nop;
 #ifdef CONFIG_PROFILER
                 s->del_op_count++;
 #endif
             } else {
             do_not_remove:
-
                 /* output args are dead */
                 dead_args = 0;
                 sync_args = 0;
-                for(i = 0; i < nb_oargs; i++) {
+                for (i = 0; i < nb_oargs; i++) {
                     arg = args[i];
                     if (dead_temps[arg]) {
                         dead_args |= (1 << i);
@@ -1527,23 +1496,18 @@ static void tcg_liveness_analysis(TCGContext *s)
                 }
 
                 /* input args are live */
-                for(i = nb_oargs; i < nb_oargs + nb_iargs; i++) {
+                for (i = nb_oargs; i < nb_oargs + nb_iargs; i++) {
                     arg = args[i];
                     if (dead_temps[arg]) {
                         dead_args |= (1 << i);
                     }
                     dead_temps[arg] = 0;
                 }
-                s->op_dead_args[op_index] = dead_args;
-                s->op_sync_args[op_index] = sync_args;
+                s->op_dead_args[oi] = dead_args;
+                s->op_sync_args[oi] = sync_args;
             }
             break;
         }
-        op_index--;
-    }
-
-    if (args != s->gen_opparam_buf) {
-        tcg_abort();
     }
 }
 #else
@@ -2110,11 +2074,11 @@ static void tcg_reg_alloc_op(TCGContext *s,
 #define STACK_DIR(x) (x)
 #endif
 
-static int tcg_reg_alloc_call(TCGContext *s, const TCGOpDef *def,
-                              TCGOpcode opc, const TCGArg *args,
-                              uint16_t dead_args, uint8_t sync_args)
+static void tcg_reg_alloc_call(TCGContext *s, int nb_oargs, int nb_iargs,
+                               const TCGArg * const args, uint16_t dead_args,
+                               uint8_t sync_args)
 {
-    int nb_iargs, nb_oargs, flags, nb_regs, i, reg, nb_params;
+    int flags, nb_regs, i, reg;
     TCGArg arg;
     TCGTemp *ts;
     intptr_t stack_offset;
@@ -2123,22 +2087,16 @@ static int tcg_reg_alloc_call(TCGContext *s, const TCGOpDef *def,
     int allocate_args;
     TCGRegSet allocated_regs;
 
-    arg = *args++;
-
-    nb_oargs = arg >> 16;
-    nb_iargs = arg & 0xffff;
-    nb_params = nb_iargs;
-
     func_addr = (tcg_insn_unit *)(intptr_t)args[nb_oargs + nb_iargs];
     flags = args[nb_oargs + nb_iargs + 1];
 
     nb_regs = ARRAY_SIZE(tcg_target_call_iarg_regs);
-    if (nb_regs > nb_params) {
-        nb_regs = nb_params;
+    if (nb_regs > nb_iargs) {
+        nb_regs = nb_iargs;
     }
 
     /* assign stack slots first */
-    call_stack_size = (nb_params - nb_regs) * sizeof(tcg_target_long);
+    call_stack_size = (nb_iargs - nb_regs) * sizeof(tcg_target_long);
     call_stack_size = (call_stack_size + TCG_TARGET_STACK_ALIGN - 1) & 
         ~(TCG_TARGET_STACK_ALIGN - 1);
     allocate_args = (call_stack_size > TCG_STATIC_CALL_ARGS_SIZE);
@@ -2149,7 +2107,7 @@ static int tcg_reg_alloc_call(TCGContext *s, const TCGOpDef *def,
     }
 
     stack_offset = TCG_TARGET_CALL_STACK_OFFSET;
-    for(i = nb_regs; i < nb_params; i++) {
+    for(i = nb_regs; i < nb_iargs; i++) {
         arg = args[nb_oargs + i];
 #ifdef TCG_TARGET_STACK_GROWSUP
         stack_offset -= sizeof(tcg_target_long);
@@ -2256,8 +2214,6 @@ static int tcg_reg_alloc_call(TCGContext *s, const TCGOpDef *def,
             }
         }
     }
-    
-    return nb_iargs + nb_oargs + def->nb_cargs + 1;
 }
 
 #ifdef CONFIG_PROFILER
@@ -2279,10 +2235,7 @@ static inline int tcg_gen_code_common(TCGContext *s,
                                       tcg_insn_unit *gen_code_buf,
                                       long search_pc)
 {
-    TCGOpcode opc;
-    int op_index;
-    const TCGOpDef *def;
-    const TCGArg *args;
+    int oi, oi_next;
 
 #ifdef DEBUG_DISAS
     if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_OP))) {
@@ -2297,8 +2250,7 @@ static inline int tcg_gen_code_common(TCGContext *s,
 #endif
 
 #ifdef USE_TCG_OPTIMIZATIONS
-    s->gen_opparam_ptr =
-        tcg_optimize(s, s->gen_opc_ptr, s->gen_opparam_buf, tcg_op_defs);
+    tcg_optimize(s);
 #endif
 
 #ifdef CONFIG_PROFILER
@@ -2327,42 +2279,31 @@ static inline int tcg_gen_code_common(TCGContext *s,
 
     tcg_out_tb_init(s);
 
-    args = s->gen_opparam_buf;
-    op_index = 0;
+    for (oi = s->gen_first_op_idx; oi >= 0; oi = oi_next) {
+        TCGOp * const op = &s->gen_op_buf[oi];
+        TCGArg * const args = &s->gen_opparam_buf[op->args];
+        TCGOpcode opc = op->opc;
+        const TCGOpDef *def = &tcg_op_defs[opc];
+        uint16_t dead_args = s->op_dead_args[oi];
+        uint8_t sync_args = s->op_sync_args[oi];
 
-    for(;;) {
-        opc = s->gen_opc_buf[op_index];
+        oi_next = op->next;
 #ifdef CONFIG_PROFILER
         tcg_table_op_count[opc]++;
 #endif
-        def = &tcg_op_defs[opc];
-#if 0
-        printf("%s: %d %d %d\n", def->name,
-               def->nb_oargs, def->nb_iargs, def->nb_cargs);
-        //        dump_regs(s);
-#endif
-        switch(opc) {
+
+        switch (opc) {
         case INDEX_op_mov_i32:
         case INDEX_op_mov_i64:
-            tcg_reg_alloc_mov(s, def, args, s->op_dead_args[op_index],
-                              s->op_sync_args[op_index]);
+            tcg_reg_alloc_mov(s, def, args, dead_args, sync_args);
             break;
         case INDEX_op_movi_i32:
         case INDEX_op_movi_i64:
-            tcg_reg_alloc_movi(s, args, s->op_dead_args[op_index],
-                               s->op_sync_args[op_index]);
+            tcg_reg_alloc_movi(s, args, dead_args, sync_args);
             break;
         case INDEX_op_debug_insn_start:
-            /* debug instruction */
-            break;
         case INDEX_op_nop:
-        case INDEX_op_nop1:
-        case INDEX_op_nop2:
-        case INDEX_op_nop3:
             break;
-        case INDEX_op_nopn:
-            args += args[0];
-            goto next;
         case INDEX_op_discard:
             temp_dead(s, args[0]);
             break;
@@ -2371,12 +2312,9 @@ static inline int tcg_gen_code_common(TCGContext *s,
             tcg_out_label(s, args[0], s->code_ptr);
             break;
         case INDEX_op_call:
-            args += tcg_reg_alloc_call(s, def, opc, args,
-                                       s->op_dead_args[op_index],
-                                       s->op_sync_args[op_index]);
-            goto next;
-        case INDEX_op_end:
-            goto the_end;
+            tcg_reg_alloc_call(s, op->callo, op->calli, args,
+                               dead_args, sync_args);
+            break;
         default:
             /* Sanity check that we've not introduced any unhandled opcodes. */
             if (def->flags & TCG_OPF_NOT_PRESENT) {
@@ -2385,21 +2323,17 @@ static inline int tcg_gen_code_common(TCGContext *s,
             /* Note: in order to speed up the code, it would be much
                faster to have specialized register allocator functions for
                some common argument patterns */
-            tcg_reg_alloc_op(s, def, opc, args, s->op_dead_args[op_index],
-                             s->op_sync_args[op_index]);
+            tcg_reg_alloc_op(s, def, opc, args, dead_args, sync_args);
             break;
         }
-        args += def->nb_args;
-    next:
         if (search_pc >= 0 && search_pc < tcg_current_code_size(s)) {
-            return op_index;
+            return oi;
         }
-        op_index++;
 #ifndef NDEBUG
         check_regs(s);
 #endif
     }
- the_end:
+
     /* Generate TB finalization at the end of block */
     tcg_out_tb_finalize(s);
     return -1;
@@ -2410,14 +2344,18 @@ int tcg_gen_code(TCGContext *s, tcg_insn_unit *gen_code_buf)
 #ifdef CONFIG_PROFILER
     {
         int n;
-        n = (s->gen_opc_ptr - s->gen_opc_buf);
+
+        n = s->gen_last_op_idx + 1;
         s->op_count += n;
-        if (n > s->op_count_max)
+        if (n > s->op_count_max) {
             s->op_count_max = n;
+        }
 
-        s->temp_count += s->nb_temps;
-        if (s->nb_temps > s->temp_count_max)
-            s->temp_count_max = s->nb_temps;
+        n = s->nb_temps;
+        s->temp_count += n;
+        if (n > s->temp_count_max) {
+            s->temp_count_max = n;
+        }
     }
 #endif
 
diff --git a/tcg/tcg.h b/tcg/tcg.h
index ff94931..c9bcac6 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -448,10 +448,28 @@ typedef struct TCGTempSet {
     unsigned long l[BITS_TO_LONGS(TCG_MAX_TEMPS)];
 } TCGTempSet;
 
+typedef struct TCGOp {
+    TCGOpcode opc   : 8;
+
+    /* The number of out and in parameters for a call.  */
+    unsigned callo  : 2;
+    unsigned calli  : 6;
+
+    /* Index of the arguments for this op, or -1 for zero-operand ops.  */
+    signed args     : 16;
+
+    /* Index of the prev/next op, or -1 for the end of the list.  */
+    signed prev     : 16;
+    signed next     : 16;
+} TCGOp;
+
+QEMU_BUILD_BUG_ON(NB_OPS > 0xff);
+QEMU_BUILD_BUG_ON(OPC_BUF_SIZE >= 0x7fff);
+QEMU_BUILD_BUG_ON(OPPARAM_BUF_SIZE >= 0x7fff);
+
 struct TCGContext {
     uint8_t *pool_cur, *pool_end;
     TCGPool *pool_first, *pool_current, *pool_first_large;
-    TCGLabel *labels;
     int nb_labels;
     int nb_globals;
     int nb_temps;
@@ -469,9 +487,6 @@ struct TCGContext {
                                corresponding output argument needs to be
                                sync to memory. */
     
-    /* tells in which temporary a given register is. It does not take
-       into account fixed registers */
-    int reg_to_temp[TCG_TARGET_NB_REGS];
     TCGRegSet reserved_regs;
     intptr_t current_frame_offset;
     intptr_t frame_start;
@@ -479,8 +494,6 @@ struct TCGContext {
     int frame_reg;
 
     tcg_insn_unit *code_ptr;
-    TCGTemp temps[TCG_MAX_TEMPS]; /* globals first, temps after */
-    TCGTempSet free_temps[TCG_TYPE_COUNT * 2];
 
     GHashTable *helpers;
 
@@ -508,14 +521,10 @@ struct TCGContext {
     int goto_tb_issue_mask;
 #endif
 
-    uint16_t gen_opc_buf[OPC_BUF_SIZE];
-    TCGArg gen_opparam_buf[OPPARAM_BUF_SIZE];
-
-    uint16_t *gen_opc_ptr;
-    TCGArg *gen_opparam_ptr;
-    target_ulong gen_opc_pc[OPC_BUF_SIZE];
-    uint16_t gen_opc_icount[OPC_BUF_SIZE];
-    uint8_t gen_opc_instr_start[OPC_BUF_SIZE];
+    int gen_first_op_idx;
+    int gen_last_op_idx;
+    int gen_next_op_idx;
+    int gen_next_parm_idx;
 
     /* Code generation.  Note that we specifically do not use tcg_insn_unit
        here, because there's too much arithmetic throughout that relies
@@ -533,6 +542,22 @@ struct TCGContext {
 
     /* The TCGBackendData structure is private to tcg-target.c.  */
     struct TCGBackendData *be;
+
+    TCGTempSet free_temps[TCG_TYPE_COUNT * 2];
+    TCGTemp temps[TCG_MAX_TEMPS]; /* globals first, temps after */
+
+    /* tells in which temporary a given register is. It does not take
+       into account fixed registers */
+    int reg_to_temp[TCG_TARGET_NB_REGS];
+
+    TCGOp gen_op_buf[OPC_BUF_SIZE];
+    TCGArg gen_opparam_buf[OPPARAM_BUF_SIZE];
+
+    target_ulong gen_opc_pc[OPC_BUF_SIZE];
+    uint16_t gen_opc_icount[OPC_BUF_SIZE];
+    uint8_t gen_opc_instr_start[OPC_BUF_SIZE];
+
+    TCGLabel labels[TCG_MAX_LABELS];
 };
 
 extern TCGContext tcg_ctx;
@@ -540,7 +565,7 @@ extern TCGContext tcg_ctx;
 /* The number of opcodes emitted so far.  */
 static inline int tcg_op_buf_count(void)
 {
-    return tcg_ctx.gen_opc_ptr - tcg_ctx.gen_opc_buf;
+    return tcg_ctx.gen_next_op_idx;
 }
 
 /* Test for whether to terminate the TB for using too many opcodes.  */
@@ -717,8 +742,7 @@ void tcg_add_target_add_op_defs(const TCGTargetOpDef *tdefs);
 void tcg_gen_callN(TCGContext *s, void *func,
                    TCGArg ret, int nargs, TCGArg *args);
 
-TCGArg *tcg_optimize(TCGContext *s, uint16_t *tcg_opc_ptr, TCGArg *args,
-                     TCGOpDef *tcg_op_def);
+void tcg_optimize(TCGContext *s);
 
 /* only used for debugging purposes */
 void tcg_dump_ops(TCGContext *s);
-- 
1.9.3


* [Qemu-devel] [PATCH 2.3 6/8] tcg: Remove opcodes instead of noping them out
  2014-11-11 16:24 [Qemu-devel] [PATCH 2.3 0/8] Linked list for tcg ops Richard Henderson
                   ` (4 preceding siblings ...)
  2014-11-11 16:24 ` [Qemu-devel] [PATCH 2.3 5/8] tcg: Put opcodes in a linked list Richard Henderson
@ 2014-11-11 16:24 ` Richard Henderson
  2014-11-14 15:08   ` Bastian Koppelmann
  2014-11-11 16:24 ` [Qemu-devel] [PATCH 2.3 7/8] tcg: Implement insert_op_before Richard Henderson
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 21+ messages in thread
From: Richard Henderson @ 2014-11-11 16:24 UTC (permalink / raw)
  To: qemu-devel; +Cc: aurelien

With the linked list scheme we need not leave nops in the stream
that we need to process later.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/optimize.c | 14 +++++++-------
 tcg/tcg.c      | 28 ++++++++++++++++++++++++----
 tcg/tcg.h      |  1 +
 3 files changed, 32 insertions(+), 11 deletions(-)
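
For readers skimming the diff, here is a minimal standalone sketch of the
unlinking step that tcg_op_remove performs.  It is not the QEMU code: the
DemoOp/DemoList types, the DEMO_* opcodes and the 8-entry buffer are
hypothetical stand-ins, assuming only what the patch shows, namely that ops
live in an array, link to each other by index, and use -1 as the end marker.

```c
#include <assert.h>

/* Hypothetical miniature of the op buffer (not the real TCGContext). */
enum { DEMO_NOP, DEMO_MOV, DEMO_ADD };
enum { DEMO_BUF_SIZE = 8 };

typedef struct DemoOp {
    int opc;
    int prev;   /* index of the previous op, or -1 at the head */
    int next;   /* index of the next op, or -1 at the tail */
} DemoOp;

typedef struct DemoList {
    DemoOp buf[DEMO_BUF_SIZE];
    int first;  /* index of the head, or -1 if the list is empty */
    int last;   /* index of the tail, or -1 if the list is empty */
    int count;  /* next free slot in buf */
} DemoList;

static void demo_init(DemoList *l)
{
    l->first = -1;
    l->last = -1;
    l->count = 0;
}

/* Append at the tail, as happens during sequential translation. */
static int demo_append(DemoList *l, int opc)
{
    int i = l->count++;

    assert(i < DEMO_BUF_SIZE);
    l->buf[i] = (DemoOp){ .opc = opc, .prev = l->last, .next = -1 };
    if (l->last >= 0) {
        l->buf[l->last].next = i;
    } else {
        l->first = i;
    }
    l->last = i;
    return i;
}

/* Unlink op i: patch both neighbours, or the head/tail index when
   there is no neighbour, then reset the slot so it looks like a nop. */
static void demo_remove(DemoList *l, int i)
{
    int next = l->buf[i].next;
    int prev = l->buf[i].prev;

    if (next >= 0) {
        l->buf[next].prev = prev;
    } else {
        l->last = prev;
    }
    if (prev >= 0) {
        l->buf[prev].next = next;
    } else {
        l->first = next;
    }
    l->buf[i] = (DemoOp){ .opc = DEMO_NOP, .prev = -1, .next = -1 };
}

/* Count the ops still reachable from the head. */
static int demo_length(const DemoList *l)
{
    int n = 0;

    for (int i = l->first; i >= 0; i = l->buf[i].next) {
        n++;
    }
    return n;
}
```

The slot of a removed op is not reused in this sketch; it simply becomes
unreachable, so later passes that walk the list never see it.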

diff --git a/tcg/optimize.c b/tcg/optimize.c
index f2b8acf..973fbb4 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -758,7 +758,7 @@ static void tcg_constant_folding(TCGContext *s)
             break;
         do_mov3:
             if (temps_are_copies(args[0], args[1])) {
-                op->opc = INDEX_op_nop;
+                tcg_op_remove(s, op);
             } else {
                 tcg_opt_gen_mov(s, op, args, opc, args[0], args[1]);
             }
@@ -916,7 +916,7 @@ static void tcg_constant_folding(TCGContext *s)
         if (affected == 0) {
             assert(nb_oargs == 1);
             if (temps_are_copies(args[0], args[1])) {
-                op->opc = INDEX_op_nop;
+                tcg_op_remove(s, op);
             } else if (temps[args[1]].state != TCG_TEMP_CONST) {
                 tcg_opt_gen_mov(s, op, args, opc, args[0], args[1]);
             } else {
@@ -948,7 +948,7 @@ static void tcg_constant_folding(TCGContext *s)
         CASE_OP_32_64(and):
             if (temps_are_copies(args[1], args[2])) {
                 if (temps_are_copies(args[0], args[1])) {
-                    op->opc = INDEX_op_nop;
+                    tcg_op_remove(s, op);
                 } else {
                     tcg_opt_gen_mov(s, op, args, opc, args[0], args[1]);
                 }
@@ -979,7 +979,7 @@ static void tcg_constant_folding(TCGContext *s)
         switch (opc) {
         CASE_OP_32_64(mov):
             if (temps_are_copies(args[0], args[1])) {
-                op->opc = INDEX_op_nop;
+                tcg_op_remove(s, op);
                 break;
             }
             if (temps[args[1]].state != TCG_TEMP_CONST) {
@@ -1074,7 +1074,7 @@ static void tcg_constant_folding(TCGContext *s)
                     op->opc = INDEX_op_br;
                     args[0] = args[3];
                 } else {
-                    op->opc = INDEX_op_nop;
+                    tcg_op_remove(s, op);
                 }
                 break;
             }
@@ -1084,7 +1084,7 @@ static void tcg_constant_folding(TCGContext *s)
             tmp = do_constant_folding_cond(opc, args[1], args[2], args[5]);
             if (tmp != 2) {
                 if (temps_are_copies(args[0], args[4-tmp])) {
-                    op->opc = INDEX_op_nop;
+                    tcg_op_remove(s, op);
                 } else if (temps[args[4-tmp]].state == TCG_TEMP_CONST) {
                     tcg_opt_gen_movi(s, op, args, opc,
                                      args[0], temps[args[4-tmp]].val);
@@ -1177,7 +1177,7 @@ static void tcg_constant_folding(TCGContext *s)
                     args[0] = args[5];
                 } else {
             do_brcond_false:
-                    op->opc = INDEX_op_nop;
+                    tcg_op_remove(s, op);
                 }
             } else if ((args[4] == TCG_COND_LT || args[4] == TCG_COND_GE)
                        && temps[args[2]].state == TCG_TEMP_CONST
diff --git a/tcg/tcg.c b/tcg/tcg.c
index f5f29d7..50d7af8 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -1244,6 +1244,29 @@ void tcg_add_target_add_op_defs(const TCGTargetOpDef *tdefs)
 #endif
 }
 
+void tcg_op_remove(TCGContext *s, TCGOp *op)
+{
+    int next = op->next;
+    int prev = op->prev;
+
+    if (next >= 0) {
+        s->gen_op_buf[next].prev = prev;
+    } else {
+        s->gen_last_op_idx = prev;
+    }
+    if (prev >= 0) {
+        s->gen_op_buf[prev].next = next;
+    } else {
+        s->gen_first_op_idx = next;
+    }
+
+    *op = (TCGOp){ .opc = INDEX_op_nop, .next = -1, .prev = -1 };
+
+#ifdef CONFIG_PROFILER
+    s->del_op_count++;
+#endif
+}
+
 #ifdef USE_LIVENESS_ANALYSIS
 /* liveness analysis: end of function: all temps are dead, and globals
    should be in memory. */
@@ -1466,10 +1489,7 @@ static void tcg_liveness_analysis(TCGContext *s)
                     }
                 }
             do_remove:
-                op->opc = INDEX_op_nop;
-#ifdef CONFIG_PROFILER
-                s->del_op_count++;
-#endif
+                tcg_op_remove(s, op);
             } else {
             do_not_remove:
                 /* output args are dead */
diff --git a/tcg/tcg.h b/tcg/tcg.h
index c9bcac6..86a56b0 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -742,6 +742,7 @@ void tcg_add_target_add_op_defs(const TCGTargetOpDef *tdefs);
 void tcg_gen_callN(TCGContext *s, void *func,
                    TCGArg ret, int nargs, TCGArg *args);
 
+void tcg_op_remove(TCGContext *s, TCGOp *op);
 void tcg_optimize(TCGContext *s);
 
 /* only used for debugging purposes */
-- 
1.9.3


* [Qemu-devel] [PATCH 2.3 7/8] tcg: Implement insert_op_before
  2014-11-11 16:24 [Qemu-devel] [PATCH 2.3 0/8] Linked list for tcg ops Richard Henderson
                   ` (5 preceding siblings ...)
  2014-11-11 16:24 ` [Qemu-devel] [PATCH 2.3 6/8] tcg: Remove opcodes instead of noping them out Richard Henderson
@ 2014-11-11 16:24 ` Richard Henderson
  2014-11-14 15:25   ` Bastian Koppelmann
  2014-11-11 16:24 ` [Qemu-devel] [PATCH 2.3 8/8] tcg: Remove unused opcodes Richard Henderson
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 21+ messages in thread
From: Richard Henderson @ 2014-11-11 16:24 UTC (permalink / raw)
  To: qemu-devel; +Cc: aurelien

Rather than reserving space in the op stream for optimization,
let the optimizer add ops as necessary.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/optimize.c | 57 +++++++++++++++++++++++++++++++++++----------------------
 tcg/tcg-op.c   | 16 ----------------
 2 files changed, 35 insertions(+), 38 deletions(-)

diff --git a/tcg/optimize.c b/tcg/optimize.c
index 973fbb4..067917c 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -67,6 +67,37 @@ static void reset_temp(TCGArg temp)
     temps[temp].mask = -1;
 }
 
+static TCGOp *insert_op_before(TCGContext *s, TCGOp *old_op,
+                                TCGOpcode opc, int nargs)
+{
+    int oi = s->gen_next_op_idx;
+    int pi = s->gen_next_parm_idx;
+    int prev = old_op->prev;
+    int next = old_op - s->gen_op_buf;
+    TCGOp *new_op;
+
+    tcg_debug_assert(oi < OPC_BUF_SIZE);
+    tcg_debug_assert(pi + nargs <= OPPARAM_BUF_SIZE);
+    s->gen_next_op_idx = oi + 1;
+    s->gen_next_parm_idx = pi + nargs;
+
+    new_op = &s->gen_op_buf[oi];
+    *new_op = (TCGOp){
+        .opc = opc,
+        .args = pi,
+        .prev = prev,
+        .next = next
+    };
+    if (prev >= 0) {
+        s->gen_op_buf[prev].next = oi;
+    } else {
+        s->gen_first_op_idx = oi;
+    }
+    old_op->prev = oi;
+
+    return new_op;
+}
+
 /* Reset all temporaries, given that there are NB_TEMPS of them.  */
 static void reset_all_temps(int nb_temps)
 {
@@ -1108,8 +1139,8 @@ static void tcg_constant_folding(TCGContext *s)
                 uint64_t a = ((uint64_t)ah << 32) | al;
                 uint64_t b = ((uint64_t)bh << 32) | bl;
                 TCGArg rl, rh;
-                TCGOp *op2;
-                TCGArg *args2;
+                TCGOp *op2 = insert_op_before(s, op, INDEX_op_movi_i32, 2);
+                TCGArg *args2 = &s->gen_opparam_buf[op2->args];
 
                 if (opc == INDEX_op_add2_i32) {
                     a += b;
@@ -1117,15 +1148,6 @@ static void tcg_constant_folding(TCGContext *s)
                     a -= b;
                 }
 
-                /* We emit the extra nop when we emit the add2/sub2.  */
-                op2 = &s->gen_op_buf[oi_next];
-                assert(op2->opc == INDEX_op_nop);
-
-                /* But we still have to allocate args for the op.  */
-                op2->args = s->gen_next_parm_idx;
-                s->gen_next_parm_idx += 2;
-                args2 = &s->gen_opparam_buf[op2->args];
-
                 rl = args[0];
                 rh = args[1];
                 tcg_opt_gen_movi(s, op, args, opc, rl, (uint32_t)a);
@@ -1144,17 +1166,8 @@ static void tcg_constant_folding(TCGContext *s)
                 uint32_t b = temps[args[3]].val;
                 uint64_t r = (uint64_t)a * b;
                 TCGArg rl, rh;
-                TCGOp *op2;
-                TCGArg *args2;
-
-                /* We emit the extra nop when we emit the mulu2.  */
-                op2 = &s->gen_op_buf[oi_next];
-                assert(op2->opc == INDEX_op_nop);
-
-                /* But we still have to allocate args for the op.  */
-                op2->args = s->gen_next_parm_idx;
-                s->gen_next_parm_idx += 2;
-                args2 = &s->gen_opparam_buf[op2->args];
+                TCGOp *op2 = insert_op_before(s, op, INDEX_op_movi_i32, 2);
+                TCGArg *args2 = &s->gen_opparam_buf[op2->args];
 
                 rl = args[0];
                 rh = args[1];
diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index fbd82bd..8de259a 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -571,8 +571,6 @@ void tcg_gen_add2_i32(TCGv_i32 rl, TCGv_i32 rh, TCGv_i32 al,
 {
     if (TCG_TARGET_HAS_add2_i32) {
         tcg_gen_op6_i32(INDEX_op_add2_i32, rl, rh, al, ah, bl, bh);
-        /* Allow the optimizer room to replace add2 with two moves.  */
-        tcg_gen_op0(&tcg_ctx, INDEX_op_nop);
     } else {
         TCGv_i64 t0 = tcg_temp_new_i64();
         TCGv_i64 t1 = tcg_temp_new_i64();
@@ -590,8 +588,6 @@ void tcg_gen_sub2_i32(TCGv_i32 rl, TCGv_i32 rh, TCGv_i32 al,
 {
     if (TCG_TARGET_HAS_sub2_i32) {
         tcg_gen_op6_i32(INDEX_op_sub2_i32, rl, rh, al, ah, bl, bh);
-        /* Allow the optimizer room to replace sub2 with two moves.  */
-        tcg_gen_op0(&tcg_ctx, INDEX_op_nop);
     } else {
         TCGv_i64 t0 = tcg_temp_new_i64();
         TCGv_i64 t1 = tcg_temp_new_i64();
@@ -608,8 +604,6 @@ void tcg_gen_mulu2_i32(TCGv_i32 rl, TCGv_i32 rh, TCGv_i32 arg1, TCGv_i32 arg2)
 {
     if (TCG_TARGET_HAS_mulu2_i32) {
         tcg_gen_op4_i32(INDEX_op_mulu2_i32, rl, rh, arg1, arg2);
-        /* Allow the optimizer room to replace mulu2 with two moves.  */
-        tcg_gen_op0(&tcg_ctx, INDEX_op_nop);
     } else if (TCG_TARGET_HAS_muluh_i32) {
         TCGv_i32 t = tcg_temp_new_i32();
         tcg_gen_op3_i32(INDEX_op_mul_i32, t, arg1, arg2);
@@ -632,8 +626,6 @@ void tcg_gen_muls2_i32(TCGv_i32 rl, TCGv_i32 rh, TCGv_i32 arg1, TCGv_i32 arg2)
 {
     if (TCG_TARGET_HAS_muls2_i32) {
         tcg_gen_op4_i32(INDEX_op_muls2_i32, rl, rh, arg1, arg2);
-        /* Allow the optimizer room to replace muls2 with two moves.  */
-        tcg_gen_op0(&tcg_ctx, INDEX_op_nop);
     } else if (TCG_TARGET_HAS_mulsh_i32) {
         TCGv_i32 t = tcg_temp_new_i32();
         tcg_gen_op3_i32(INDEX_op_mul_i32, t, arg1, arg2);
@@ -1648,8 +1640,6 @@ void tcg_gen_add2_i64(TCGv_i64 rl, TCGv_i64 rh, TCGv_i64 al,
 {
     if (TCG_TARGET_HAS_add2_i64) {
         tcg_gen_op6_i64(INDEX_op_add2_i64, rl, rh, al, ah, bl, bh);
-        /* Allow the optimizer room to replace add2 with two moves.  */
-        tcg_gen_op0(&tcg_ctx, INDEX_op_nop);
     } else {
         TCGv_i64 t0 = tcg_temp_new_i64();
         TCGv_i64 t1 = tcg_temp_new_i64();
@@ -1668,8 +1658,6 @@ void tcg_gen_sub2_i64(TCGv_i64 rl, TCGv_i64 rh, TCGv_i64 al,
 {
     if (TCG_TARGET_HAS_sub2_i64) {
         tcg_gen_op6_i64(INDEX_op_sub2_i64, rl, rh, al, ah, bl, bh);
-        /* Allow the optimizer room to replace sub2 with two moves.  */
-        tcg_gen_op0(&tcg_ctx, INDEX_op_nop);
     } else {
         TCGv_i64 t0 = tcg_temp_new_i64();
         TCGv_i64 t1 = tcg_temp_new_i64();
@@ -1687,8 +1675,6 @@ void tcg_gen_mulu2_i64(TCGv_i64 rl, TCGv_i64 rh, TCGv_i64 arg1, TCGv_i64 arg2)
 {
     if (TCG_TARGET_HAS_mulu2_i64) {
         tcg_gen_op4_i64(INDEX_op_mulu2_i64, rl, rh, arg1, arg2);
-        /* Allow the optimizer room to replace mulu2 with two moves.  */
-        tcg_gen_op0(&tcg_ctx, INDEX_op_nop);
     } else if (TCG_TARGET_HAS_muluh_i64) {
         TCGv_i64 t = tcg_temp_new_i64();
         tcg_gen_op3_i64(INDEX_op_mul_i64, t, arg1, arg2);
@@ -1708,8 +1694,6 @@ void tcg_gen_muls2_i64(TCGv_i64 rl, TCGv_i64 rh, TCGv_i64 arg1, TCGv_i64 arg2)
 {
     if (TCG_TARGET_HAS_muls2_i64) {
         tcg_gen_op4_i64(INDEX_op_muls2_i64, rl, rh, arg1, arg2);
-        /* Allow the optimizer room to replace muls2 with two moves.  */
-        tcg_gen_op0(&tcg_ctx, INDEX_op_nop);
     } else if (TCG_TARGET_HAS_mulsh_i64) {
         TCGv_i64 t = tcg_temp_new_i64();
         tcg_gen_op3_i64(INDEX_op_mul_i64, t, arg1, arg2);
-- 
1.9.3
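
For readers following along, the linking done by insert_op_before can be
sketched in isolation. This is only a minimal illustration, not the QEMU code:
Op, OpList, MAX_OPS, oplist_init and oplist_insert_before are hypothetical
stand-ins for TCGOp, TCGContext, OPC_BUF_SIZE and the function in the patch,
and the parameter buffer is left out. It assumes, as the patch does, that links
are buffer indices and that a negative index means "no neighbour".

```c
#include <assert.h>

#define MAX_OPS 8            /* hypothetical capacity, standing in for OPC_BUF_SIZE */

/* Hypothetical stand-in for TCGOp: links are buffer indices, -1 means "none". */
typedef struct {
    int opc;
    int prev;
    int next;
} Op;

typedef struct {
    Op buf[MAX_OPS];
    int first;               /* index of the first op in list order */
    int next_free;           /* next unused slot in buf */
} OpList;

/* Start with a single op in slot 0 so there is something to insert before. */
static void oplist_init(OpList *l, int opc)
{
    l->buf[0] = (Op){ .opc = opc, .prev = -1, .next = -1 };
    l->first = 0;
    l->next_free = 1;
}

/* Allocate a fresh slot at the end of the buffer and link it in front of old_op. */
static Op *oplist_insert_before(OpList *l, Op *old_op, int opc)
{
    int oi = l->next_free;
    int prev = old_op->prev;
    int next = (int)(old_op - l->buf);

    assert(oi < MAX_OPS);
    l->next_free = oi + 1;

    l->buf[oi] = (Op){ .opc = opc, .prev = prev, .next = next };
    if (prev >= 0) {
        l->buf[prev].next = oi;   /* old predecessor now points at the new op */
    } else {
        l->first = oi;            /* old_op was the head, so the new op becomes it */
    }
    old_op->prev = oi;

    return &l->buf[oi];
}
```

The point of the sketch is that buffer order and list order are decoupled: the
new op always lands in the next free slot, wherever it sits in the list.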

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH 2.3 8/8] tcg: Remove unused opcodes
  2014-11-11 16:24 [Qemu-devel] [PATCH 2.3 0/8] Linked list for tcg ops Richard Henderson
                   ` (6 preceding siblings ...)
  2014-11-11 16:24 ` [Qemu-devel] [PATCH 2.3 7/8] tcg: Implement insert_op_before Richard Henderson
@ 2014-11-11 16:24 ` Richard Henderson
  2014-11-14 15:31   ` Bastian Koppelmann
  2014-11-14 18:22 ` [Qemu-devel] [PATCH 2.3 0/8] Linked list for tcg ops Bastian Koppelmann
  2015-01-03  8:46 ` Paolo Bonzini
  9 siblings, 1 reply; 21+ messages in thread
From: Richard Henderson @ 2014-11-11 16:24 UTC (permalink / raw)
  To: qemu-devel; +Cc: aurelien

We no longer need INDEX_op_end to terminate the list, nor do we
need 5 forms of nop, since we just remove the TCGOp instead.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 tcg/tcg-opc.h |  9 ---------
 tcg/tcg.c     | 10 ++++------
 tci.c         | 13 -------------
 3 files changed, 4 insertions(+), 28 deletions(-)

diff --git a/tcg/tcg-opc.h b/tcg/tcg-opc.h
index 042d442..42d0cfe 100644
--- a/tcg/tcg-opc.h
+++ b/tcg/tcg-opc.h
@@ -27,15 +27,6 @@
  */
 
 /* predefined ops */
-DEF(end, 0, 0, 0, TCG_OPF_NOT_PRESENT) /* must be kept first */
-DEF(nop, 0, 0, 0, TCG_OPF_NOT_PRESENT)
-DEF(nop1, 0, 0, 1, TCG_OPF_NOT_PRESENT)
-DEF(nop2, 0, 0, 2, TCG_OPF_NOT_PRESENT)
-DEF(nop3, 0, 0, 3, TCG_OPF_NOT_PRESENT)
-
-/* variable number of parameters */
-DEF(nopn, 0, 0, 1, TCG_OPF_NOT_PRESENT)
-
 DEF(discard, 1, 0, 0, TCG_OPF_NOT_PRESENT)
 DEF(set_label, 0, 0, 1, TCG_OPF_BB_END | TCG_OPF_NOT_PRESENT)
 
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 50d7af8..e73b8c9 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -1260,7 +1260,7 @@ void tcg_op_remove(TCGContext *s, TCGOp *op)
         s->gen_first_op_idx = next;
     }
 
-    *op = (TCGOp){ .opc = INDEX_op_nop, .next = -1, .prev = -1 };
+    memset(op, -1, sizeof(*op));
 
 #ifdef CONFIG_PROFILER
     s->del_op_count++;
@@ -1385,8 +1385,6 @@ static void tcg_liveness_analysis(TCGContext *s)
             }
             break;
         case INDEX_op_debug_insn_start:
-        case INDEX_op_nop:
-        case INDEX_op_end:
             break;
         case INDEX_op_discard:
             /* mark the temporary as dead */
@@ -2244,8 +2242,9 @@ static void dump_op_count(void)
 {
     int i;
 
-    for(i = INDEX_op_end; i < NB_OPS; i++) {
-        qemu_log("%s %" PRId64 "\n", tcg_op_defs[i].name, tcg_table_op_count[i]);
+    for (i = 0; i < NB_OPS; i++) {
+        qemu_log("%s %" PRId64 "\n", tcg_op_defs[i].name,
+                 tcg_table_op_count[i]);
     }
 }
 #endif
@@ -2322,7 +2321,6 @@ static inline int tcg_gen_code_common(TCGContext *s,
             tcg_reg_alloc_movi(s, args, dead_args, sync_args);
             break;
         case INDEX_op_debug_insn_start:
-        case INDEX_op_nop:
             break;
         case INDEX_op_discard:
             temp_dead(s, args[0]);
diff --git a/tci.c b/tci.c
index 4711ee4..28292b3 100644
--- a/tci.c
+++ b/tci.c
@@ -506,19 +506,6 @@ uintptr_t tcg_qemu_tb_exec(CPUArchState *env, uint8_t *tb_ptr)
         tb_ptr += 2;
 
         switch (opc) {
-        case INDEX_op_end:
-        case INDEX_op_nop:
-            break;
-        case INDEX_op_nop1:
-        case INDEX_op_nop2:
-        case INDEX_op_nop3:
-        case INDEX_op_nopn:
-        case INDEX_op_discard:
-            TODO();
-            break;
-        case INDEX_op_set_label:
-            TODO();
-            break;
         case INDEX_op_call:
             t0 = tci_read_ri(&tb_ptr);
 #if TCG_TARGET_REG_BITS == 32
-- 
1.9.3
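
As an illustration of why the nop opcodes are no longer needed, here is a
minimal sketch of removal from an index-linked list. Op, OpList, op_remove and
op_count are hypothetical names, not the QEMU definitions; the unlink logic is
assumed to mirror tcg_op_remove, of which only the tail is visible in the hunk
above. The memset to -1 plays the same role as in the patch: the removed slot
is poisoned rather than given a valid opcode, since nothing walks it any more.
With plain int fields on a two's-complement machine, every field reads back
as -1.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for TCGOp: links are buffer indices, -1 means "none". */
typedef struct {
    int opc;
    int prev;
    int next;
} Op;

typedef struct {
    Op *buf;
    int first;   /* index of the first op in list order, -1 if empty */
    int last;    /* index of the last op in list order, -1 if empty */
} OpList;

/* Unlink op from the list and poison its slot. */
static void op_remove(OpList *l, Op *op)
{
    int next = op->next;
    int prev = op->prev;

    if (next >= 0) {
        l->buf[next].prev = prev;
    } else {
        l->last = prev;
    }
    if (prev >= 0) {
        l->buf[prev].next = next;
    } else {
        l->first = next;
    }

    memset(op, -1, sizeof(*op));
}

/* Count the ops still reachable by following next links from the head. */
static int op_count(const OpList *l)
{
    int n = 0;
    for (int i = l->first; i >= 0; i = l->buf[i].next) {
        n++;
    }
    return n;
}
```

A later pass that follows the links never sees the removed slot, so it needs no
case for a placeholder opcode.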

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 2.3 3/8] tcg: Move emit of INDEX_op_end into gen_tb_end
  2014-11-11 16:24 ` [Qemu-devel] [PATCH 2.3 3/8] tcg: Move emit of INDEX_op_end into gen_tb_end Richard Henderson
@ 2014-11-13 15:57   ` Bastian Koppelmann
  0 siblings, 0 replies; 21+ messages in thread
From: Bastian Koppelmann @ 2014-11-13 15:57 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: aurelien


On 11/11/2014 04:24 PM, Richard Henderson wrote:
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>   include/exec/gen-icount.h     | 2 ++
>   target-alpha/translate.c      | 2 +-
>   target-arm/translate-a64.c    | 1 -
>   target-arm/translate.c        | 1 -
>   target-cris/translate.c       | 2 +-
>   target-i386/translate.c       | 2 +-
>   target-lm32/translate.c       | 2 +-
>   target-m68k/translate.c       | 1 -
>   target-microblaze/translate.c | 2 +-
>   target-mips/translate.c       | 2 +-
>   target-moxie/translate.c      | 2 +-
>   target-openrisc/translate.c   | 2 +-
>   target-ppc/translate.c        | 2 +-
>   target-s390x/translate.c      | 2 +-
>   target-sh4/translate.c        | 2 +-
>   target-sparc/translate.c      | 2 +-
>   target-tricore/translate.c    | 1 -
>   target-unicore32/translate.c  | 1 -
>   target-xtensa/translate.c     | 1 -
>   19 files changed, 14 insertions(+), 18 deletions(-)
I'm quite happy with that change, since this was really confusing if 
you were writing a guest that uses the tcg frontend.

Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>

Cheers,
Bastian

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 2.3 4/8] tcg: Introduce tcg_op_buf_count and tcg_op_buf_full
  2014-11-11 16:24 ` [Qemu-devel] [PATCH 2.3 4/8] tcg: Introduce tcg_op_buf_count and tcg_op_buf_full Richard Henderson
@ 2014-11-13 16:13   ` Bastian Koppelmann
  0 siblings, 0 replies; 21+ messages in thread
From: Bastian Koppelmann @ 2014-11-13 16:13 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: aurelien


On 11/11/2014 04:24 PM, Richard Henderson wrote:
> The method by which we count the number of ops emitted
> is going to change.  Abstract that away into some inlines.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>   target-alpha/translate.c      | 14 +++++++-------
>   target-arm/translate-a64.c    |  9 +++------
>   target-arm/translate.c        |  9 +++------
>   target-cris/translate.c       | 13 +++++--------
>   target-i386/translate.c       |  9 +++------
>   target-lm32/translate.c       | 14 +++++---------
>   target-m68k/translate.c       |  9 +++------
>   target-microblaze/translate.c | 20 ++++++++------------
>   target-mips/translate.c       |  8 +++-----
>   target-moxie/translate.c      |  8 +++-----
>   target-openrisc/translate.c   | 13 +++++--------
>   target-ppc/translate.c        |  9 +++------
>   target-s390x/translate.c      |  9 +++------
>   target-sh4/translate.c        |  8 +++-----
>   target-sparc/translate.c      |  8 +++-----
>   target-tricore/translate.c    |  4 +---
>   target-unicore32/translate.c  |  9 +++------
>   target-xtensa/translate.c     |  7 +++----
>   tcg/tcg.h                     | 12 ++++++++++++
>   19 files changed, 79 insertions(+), 113 deletions(-)
>
Again, I'm quite happy with that change, since these functions make it 
really clear what is happening.

Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>

Cheers,
Bastian

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 2.3 7/8] tcg: Implement insert_op_before
  2014-11-14 15:25   ` Bastian Koppelmann
@ 2014-11-14 14:46     ` Richard Henderson
  0 siblings, 0 replies; 21+ messages in thread
From: Richard Henderson @ 2014-11-14 14:46 UTC (permalink / raw)
  To: Bastian Koppelmann, qemu-devel; +Cc: aurelien

On 11/14/2014 04:25 PM, Bastian Koppelmann wrote:
> 
> On 11/11/2014 04:24 PM, Richard Henderson wrote:
>> Rather than reserving space in the op stream for optimization,
>> let the optimizer add ops as necessary.
>>
>> Signed-off-by: Richard Henderson <rth@twiddle.net>
>> ---
>>   tcg/optimize.c | 57 +++++++++++++++++++++++++++++++++++----------------------
>>   tcg/tcg-op.c   | 16 ----------------
>>   2 files changed, 35 insertions(+), 38 deletions(-)
>>
>> diff --git a/tcg/optimize.c b/tcg/optimize.c
>> index 973fbb4..067917c 100644
>> --- a/tcg/optimize.c
>> +++ b/tcg/optimize.c
>> @@ -67,6 +67,37 @@ static void reset_temp(TCGArg temp)
>>       temps[temp].mask = -1;
>>   }
>>   +static TCGOp *insert_op_before(TCGContext *s, TCGOp *old_op,
>> +                                TCGOpcode opc, int nargs)
>> +{
>> +    int oi = s->gen_next_op_idx;
>> +    int pi = s->gen_next_parm_idx;
>> +    int prev = old_op->prev;
>> +    int next = old_op - s->gen_op_buf;
>> +    TCGOp *new_op;
>> +
>> +    tcg_debug_assert(oi < OPC_BUF_SIZE);
>> +    tcg_debug_assert(pi + nargs <= OPPARAM_BUF_SIZE);
> I think it is better to ensure these assertions always hold, e.g.
> 
>     if (oi >= OPC_BUF_SIZE || pi + nargs > OPPARAM_BUF_SIZE) {
>         return NULL;
>     }
>     ...
>     TCGOp *op2 = insert_op_before(s, op, INDEX_op_movi_i32, 2);
>     if (op2) {
>         args2 = &s->gen_opparam_buf[op2->args];
>     }
> 
> Or how do we know they always hold?

For the same reason we don't bother checking during initial generation of the
opcodes.  We simply assume there's enough space.  Not a good answer but...

> All references to tcg_gen_op0 are gone, so let's remove it.

Fair enough.


r~
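
For reference, the checked variant discussed above could look roughly like the
following. This is a sketch under assumptions, not something from the series:
Bufs, OP_CAP, PARM_CAP and reserve_op are hypothetical names, and the capacities
are tiny so the failure path is easy to exercise. The failure condition is the
negation of the two assertions: fail when the op buffer is already full or when
the parameters would not fit.

```c
#include <assert.h>
#include <stdbool.h>

#define OP_CAP   4   /* hypothetical, standing in for OPC_BUF_SIZE */
#define PARM_CAP 6   /* hypothetical, standing in for OPPARAM_BUF_SIZE */

typedef struct {
    int next_op;     /* next unused op slot */
    int next_parm;   /* next unused parameter slot */
} Bufs;

/*
 * Reserve one op slot and nargs parameter slots.
 * On success, store the reserved indices in *oi and *pi and return true.
 * On failure, reserve nothing and return false.
 */
static bool reserve_op(Bufs *b, int nargs, int *oi, int *pi)
{
    assert(nargs >= 0);
    if (b->next_op >= OP_CAP || b->next_parm + nargs > PARM_CAP) {
        return false;
    }
    *oi = b->next_op;
    *pi = b->next_parm;
    b->next_op += 1;
    b->next_parm += nargs;
    return true;
}
```

Every caller would then need a fallback for the false case, for example leaving
the original op unoptimized, which is the cost the assertion-only approach
avoids.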

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 2.3 8/8] tcg: Remove unused opcodes
  2014-11-14 15:31   ` Bastian Koppelmann
@ 2014-11-14 14:47     ` Richard Henderson
  0 siblings, 0 replies; 21+ messages in thread
From: Richard Henderson @ 2014-11-14 14:47 UTC (permalink / raw)
  To: Bastian Koppelmann, qemu-devel; +Cc: aurelien

On 11/14/2014 04:31 PM, Bastian Koppelmann wrote:
> 
> On 11/11/2014 04:24 PM, Richard Henderson wrote:
>> diff --git a/tci.c b/tci.c
>> index 4711ee4..28292b3 100644
>> --- a/tci.c
>> +++ b/tci.c
>> @@ -506,19 +506,6 @@ uintptr_t tcg_qemu_tb_exec(CPUArchState *env, uint8_t
>> *tb_ptr)
>>           tb_ptr += 2;
>>             switch (opc) {
>> -        case INDEX_op_end:
>> -        case INDEX_op_nop:
>> -            break;
>> -        case INDEX_op_nop1:
>> -        case INDEX_op_nop2:
>> -        case INDEX_op_nop3:
>> -        case INDEX_op_nopn:
>> -        case INDEX_op_discard:
>> -            TODO();
>> -            break;
>> -        case INDEX_op_set_label:
>> -            TODO();
>> -            break;
> Why do you remove the TODO notice for INDEX_op_discard/set_label? Is TCI no
> longer maintained?

Barely.  But these opcodes never reach this far anyway, so the todo is bogus.


r~

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 2.3 5/8] tcg: Put opcodes in a linked list
  2014-11-11 16:24 ` [Qemu-devel] [PATCH 2.3 5/8] tcg: Put opcodes in a linked list Richard Henderson
@ 2014-11-14 15:03   ` Bastian Koppelmann
  0 siblings, 0 replies; 21+ messages in thread
From: Bastian Koppelmann @ 2014-11-14 15:03 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: aurelien


On 11/11/2014 04:24 PM, Richard Henderson wrote:
> +static void tcg_gen_op_begin(TCGContext *ctx, TCGOpcode opc, int args)
> +{
> +    int oi = ctx->gen_next_op_idx;
> +    int ni = oi + 1;
> +    int pi = oi - 1;
> +
> +    tcg_debug_assert(oi < OPC_BUF_SIZE);
> +    ctx->gen_last_op_idx = oi;
> +    ctx->gen_next_op_idx = ni;
> +
> +    ctx->gen_op_buf[oi] = (TCGOp){
> +        .opc = opc,
> +        .args = args,
> +        .prev = pi,
> +        .next = ni
> +    };
> +}
> +
The name of this function says "begin", but it is used at the end of each 
tcg_gen_op. How about tcg_gen_op_list_add?
> @@ -508,14 +521,10 @@ struct TCGContext {
>       int goto_tb_issue_mask;
>   #endif
>   
> -    uint16_t gen_opc_buf[OPC_BUF_SIZE];
> -    TCGArg gen_opparam_buf[OPPARAM_BUF_SIZE];
> -
> -    uint16_t *gen_opc_ptr;
> -    TCGArg *gen_opparam_ptr;
You forgot to remove gen_opc_ptr in the dummy function 
tcg_liveness_analysis, in case USE_LIVENESS_ANALYSIS is not defined.
> -    target_ulong gen_opc_pc[OPC_BUF_SIZE];
> -    uint16_t gen_opc_icount[OPC_BUF_SIZE];
> -    uint8_t gen_opc_instr_start[OPC_BUF_SIZE];
> +    int gen_first_op_idx;
> +    int gen_last_op_idx;
> +    int gen_next_op_idx;
> +    int gen_next_parm_idx;
>
Other than that it looks good to me.

Cheers,
Bastian
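
To make the quoted append step concrete, here is a small self-contained sketch
of emitting ops in order and then walking them both ways. Op, OpList and the
oplist_* and walk_* helpers are hypothetical names. As in the quoted function,
each appended op gets prev = oi - 1 and next = oi + 1. The sketch assumes the
last op's next link is set to -1 once generation finishes, which is not shown
in the quoted hunk.

```c
#include <assert.h>

#define MAX_OPS 8   /* hypothetical capacity, standing in for OPC_BUF_SIZE */

/* Hypothetical stand-in for TCGOp: links are buffer indices, -1 means "none". */
typedef struct {
    int opc;
    int prev;
    int next;
} Op;

typedef struct {
    Op buf[MAX_OPS];
    int first;
    int last;
    int next_free;
} OpList;

static void oplist_reset(OpList *l)
{
    l->first = 0;
    l->last = -1;
    l->next_free = 0;
}

/* Sequential emission: the neighbours are simply the adjacent slots. */
static void oplist_append(OpList *l, int opc)
{
    int oi = l->next_free;

    assert(oi < MAX_OPS);
    l->last = oi;
    l->next_free = oi + 1;
    l->buf[oi] = (Op){ .opc = opc, .prev = oi - 1, .next = oi + 1 };
}

/* Assumed finishing step: terminate the list at the last emitted op. */
static void oplist_finish(OpList *l)
{
    if (l->last >= 0) {
        l->buf[l->last].next = -1;
    } else {
        l->first = -1;
    }
}

/* Copy opcodes in list order into out; return how many were visited. */
static int walk_forward(const OpList *l, int *out)
{
    int n = 0;
    for (int i = l->first; i >= 0; i = l->buf[i].next) {
        out[n++] = l->buf[i].opc;
    }
    return n;
}

/* Same walk in reverse order, starting from the last op. */
static int walk_backward(const OpList *l, int *out)
{
    int n = 0;
    for (int i = l->last; i >= 0; i = l->buf[i].prev) {
        out[n++] = l->buf[i].opc;
    }
    return n;
}
```

Both walks use only the links, so neither direction needs to know how many
arguments an op carries.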

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 2.3 6/8] tcg: Remove opcodes instead of noping them out
  2014-11-11 16:24 ` [Qemu-devel] [PATCH 2.3 6/8] tcg: Remove opcodes instead of noping them out Richard Henderson
@ 2014-11-14 15:08   ` Bastian Koppelmann
  0 siblings, 0 replies; 21+ messages in thread
From: Bastian Koppelmann @ 2014-11-14 15:08 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: aurelien

On 11/11/2014 04:24 PM, Richard Henderson wrote:
> With the linked list scheme we need not leave nops in the stream
> that we need to process later.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>   tcg/optimize.c | 14 +++++++-------
>   tcg/tcg.c      | 28 ++++++++++++++++++++++++----
>   tcg/tcg.h      |  1 +
>   3 files changed, 32 insertions(+), 11 deletions(-)
>
>
Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 2.3 7/8] tcg: Implement insert_op_before
  2014-11-11 16:24 ` [Qemu-devel] [PATCH 2.3 7/8] tcg: Implement insert_op_before Richard Henderson
@ 2014-11-14 15:25   ` Bastian Koppelmann
  2014-11-14 14:46     ` Richard Henderson
  0 siblings, 1 reply; 21+ messages in thread
From: Bastian Koppelmann @ 2014-11-14 15:25 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: aurelien


On 11/11/2014 04:24 PM, Richard Henderson wrote:
> Rather than reserving space in the op stream for optimization,
> let the optimizer add ops as necessary.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>   tcg/optimize.c | 57 +++++++++++++++++++++++++++++++++++----------------------
>   tcg/tcg-op.c   | 16 ----------------
>   2 files changed, 35 insertions(+), 38 deletions(-)
>
> diff --git a/tcg/optimize.c b/tcg/optimize.c
> index 973fbb4..067917c 100644
> --- a/tcg/optimize.c
> +++ b/tcg/optimize.c
> @@ -67,6 +67,37 @@ static void reset_temp(TCGArg temp)
>       temps[temp].mask = -1;
>   }
>   
> +static TCGOp *insert_op_before(TCGContext *s, TCGOp *old_op,
> +                                TCGOpcode opc, int nargs)
> +{
> +    int oi = s->gen_next_op_idx;
> +    int pi = s->gen_next_parm_idx;
> +    int prev = old_op->prev;
> +    int next = old_op - s->gen_op_buf;
> +    TCGOp *new_op;
> +
> +    tcg_debug_assert(oi < OPC_BUF_SIZE);
> +    tcg_debug_assert(pi + nargs <= OPPARAM_BUF_SIZE);
I think it is better to ensure these assertions always hold, e.g.

     if (oi >= OPC_BUF_SIZE || pi + nargs > OPPARAM_BUF_SIZE) {
         return NULL;
     }
     ...
     TCGOp *op2 = insert_op_before(s, op, INDEX_op_movi_i32, 2);
     if (op2) {
         args2 = &s->gen_opparam_buf[op2->args];
     }

Or how do we know they always hold?

> diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
> index fbd82bd..8de259a 100644
> --- a/tcg/tcg-op.c
> +++ b/tcg/tcg-op.c
> @@ -571,8 +571,6 @@ void tcg_gen_add2_i32(TCGv_i32 rl, TCGv_i32 rh, TCGv_i32 al,
>   {
>       if (TCG_TARGET_HAS_add2_i32) {
>           tcg_gen_op6_i32(INDEX_op_add2_i32, rl, rh, al, ah, bl, bh);
> -        /* Allow the optimizer room to replace add2 with two moves.  */
> -        tcg_gen_op0(&tcg_ctx, INDEX_op_nop);
>
All references to tcg_gen_op0 are gone, so let's remove it.

Cheers,
Bastian
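
As background on why add2/sub2 need a second op at all: when all four inputs
are known constants, the optimizer can compute the 64-bit result and replace the
one double-word op with two 32-bit constant moves, one per output half. A
minimal sketch of that arithmetic, following the constant-folding hunk in the
patch, is below. fold_addsub2_i32 is a hypothetical helper, not a QEMU function,
and the split into low and high halves for the second move is an assumption
based on the first move taking (uint32_t)a.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Fold a 32-bit double-word add or subtract of constants.
 * (al, ah) and (bl, bh) are the low/high halves of the two 64-bit inputs;
 * *rl and *rh receive the halves that the two movi ops would load.
 */
static void fold_addsub2_i32(bool is_add,
                             uint32_t al, uint32_t ah,
                             uint32_t bl, uint32_t bh,
                             uint32_t *rl, uint32_t *rh)
{
    uint64_t a = ((uint64_t)ah << 32) | al;
    uint64_t b = ((uint64_t)bh << 32) | bl;

    if (is_add) {
        a += b;
    } else {
        a -= b;
    }

    *rl = (uint32_t)a;          /* value for the move into the low output */
    *rh = (uint32_t)(a >> 32);  /* value for the move into the high output */
}
```

For example, adding 1 to 0x00000000ffffffff carries into the high half, so the
two moves would load 0 and 1 respectively.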

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 2.3 8/8] tcg: Remove unused opcodes
  2014-11-11 16:24 ` [Qemu-devel] [PATCH 2.3 8/8] tcg: Remove unused opcodes Richard Henderson
@ 2014-11-14 15:31   ` Bastian Koppelmann
  2014-11-14 14:47     ` Richard Henderson
  0 siblings, 1 reply; 21+ messages in thread
From: Bastian Koppelmann @ 2014-11-14 15:31 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: aurelien


On 11/11/2014 04:24 PM, Richard Henderson wrote:
> diff --git a/tci.c b/tci.c
> index 4711ee4..28292b3 100644
> --- a/tci.c
> +++ b/tci.c
> @@ -506,19 +506,6 @@ uintptr_t tcg_qemu_tb_exec(CPUArchState *env, uint8_t *tb_ptr)
>           tb_ptr += 2;
>   
>           switch (opc) {
> -        case INDEX_op_end:
> -        case INDEX_op_nop:
> -            break;
> -        case INDEX_op_nop1:
> -        case INDEX_op_nop2:
> -        case INDEX_op_nop3:
> -        case INDEX_op_nopn:
> -        case INDEX_op_discard:
> -            TODO();
> -            break;
> -        case INDEX_op_set_label:
> -            TODO();
> -            break;
Why do you remove the TODO notice for INDEX_op_discard/set_label? Is TCI 
no longer maintained?
>           case INDEX_op_call:
>               t0 = tci_read_ri(&tb_ptr);
>   #if TCG_TARGET_REG_BITS == 32
Other than that,

Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>

Cheers,
Bastian

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 2.3 1/8] tcg: Move some opcode generation functions out of line
  2014-11-11 16:24 ` [Qemu-devel] [PATCH 2.3 1/8] tcg: Move some opcode generation functions out of line Richard Henderson
@ 2014-11-14 18:01   ` Bastian Koppelmann
  0 siblings, 0 replies; 21+ messages in thread
From: Bastian Koppelmann @ 2014-11-14 18:01 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: aurelien


On 11/11/2014 04:24 PM, Richard Henderson wrote:
> Some of these functions are really quite large.  We have a number of
> things that ought to be circularly dependent, but we duplicated code
> to break that chain for the inlines.
>
> This saved 25% of the code size of one of the translators I examined.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>   Makefile.target |    2 +-
>   tcg/tcg-op.c    | 1978 +++++++++++++++++++++++++++++++++++++++++++
>   tcg/tcg-op.h    | 2488 ++++++++-----------------------------------------------
>   tcg/tcg.c       |  137 ---
>   tcg/tcg.h       |    3 -
>   5 files changed, 2339 insertions(+), 2269 deletions(-)
>   create mode 100644 tcg/tcg-op.c
>
>
Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>

Cheers,
Bastian

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 2.3 2/8] tcg: Reduce ifdefs in tcg-op.c
  2014-11-11 16:24 ` [Qemu-devel] [PATCH 2.3 2/8] tcg: Reduce ifdefs in tcg-op.c Richard Henderson
@ 2014-11-14 18:20   ` Bastian Koppelmann
  0 siblings, 0 replies; 21+ messages in thread
From: Bastian Koppelmann @ 2014-11-14 18:20 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: aurelien


On 11/11/2014 04:24 PM, Richard Henderson wrote:
> Almost completely eliminates the ifdefs in this file, improving
> confidence in the lesser used 32-bit builds.
>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>   tcg/tcg-op.c | 449 +++++++++++++++++++++++++++--------------------------------
>   1 file changed, 207 insertions(+), 242 deletions(-)
Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>

Cheers,
Bastian

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 2.3 0/8] Linked list for tcg ops
  2014-11-11 16:24 [Qemu-devel] [PATCH 2.3 0/8] Linked list for tcg ops Richard Henderson
                   ` (7 preceding siblings ...)
  2014-11-11 16:24 ` [Qemu-devel] [PATCH 2.3 8/8] tcg: Remove unused opcodes Richard Henderson
@ 2014-11-14 18:22 ` Bastian Koppelmann
  2015-01-03  8:46 ` Paolo Bonzini
  9 siblings, 0 replies; 21+ messages in thread
From: Bastian Koppelmann @ 2014-11-14 18:22 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel; +Cc: aurelien


On 11/11/2014 04:24 PM, Richard Henderson wrote:
> Richard Henderson (8):
>    tcg: Move some opcode generation functions out of line
>    tcg: Reduce ifdefs in tcg-op.c
>    tcg: Move emit of INDEX_op_end into gen_tb_end
>    tcg: Introduce tcg_op_buf_count and tcg_op_buf_full
>    tcg: Put opcodes in a linked list
>    tcg: Remove opcodes instead of noping them out
>    tcg: Implement insert_op_before
>    tcg: Remove unused opcodes
>
>   Makefile.target               |    2 +-
>   include/exec/gen-icount.h     |   22 +-
>   target-alpha/translate.c      |   16 +-
>   target-arm/translate-a64.c    |   10 +-
>   target-arm/translate.c        |   10 +-
>   target-cris/translate.c       |   15 +-
>   target-i386/translate.c       |   11 +-
>   target-lm32/translate.c       |   16 +-
>   target-m68k/translate.c       |   10 +-
>   target-microblaze/translate.c |   22 +-
>   target-mips/translate.c       |   10 +-
>   target-moxie/translate.c      |   10 +-
>   target-openrisc/translate.c   |   15 +-
>   target-ppc/translate.c        |   11 +-
>   target-s390x/translate.c      |   11 +-
>   target-sh4/translate.c        |   10 +-
>   target-sparc/translate.c      |   10 +-
>   target-tricore/translate.c    |    5 +-
>   target-unicore32/translate.c  |   10 +-
>   target-xtensa/translate.c     |    8 +-
>   tcg/optimize.c                |  307 +++--
>   tcg/tcg-op.c                  | 1941 ++++++++++++++++++++++++++++++++
>   tcg/tcg-op.h                  | 2488 ++++++-----------------------------------
>   tcg/tcg-opc.h                 |    9 -
>   tcg/tcg.c                     |  535 +++------
>   tcg/tcg.h                     |   72 +-
>   tci.c                         |   13 -
>   27 files changed, 2761 insertions(+), 2838 deletions(-)
>   create mode 100644 tcg/tcg-op.c
>
Richard, doing the review for the tcg changes helped me understand 
how tcg works. So whenever you have more changes for tcg, feel free 
to CC me.

Cheers,
Bastian

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 2.3 0/8] Linked list for tcg ops
  2014-11-11 16:24 [Qemu-devel] [PATCH 2.3 0/8] Linked list for tcg ops Richard Henderson
                   ` (8 preceding siblings ...)
  2014-11-14 18:22 ` [Qemu-devel] [PATCH 2.3 0/8] Linked list for tcg ops Bastian Koppelmann
@ 2015-01-03  8:46 ` Paolo Bonzini
  9 siblings, 0 replies; 21+ messages in thread
From: Paolo Bonzini @ 2015-01-03  8:46 UTC (permalink / raw)
  To: Richard Henderson, Aurelien Jarno, qemu-devel



On 11/11/2014 17:24, Richard Henderson wrote:
> Currently tcg ops are simply placed in a buffer in order.  Which is
> fine until we want to actually do something with the opcode stream,
> such as optimize them.  Note the horrible things like call opcodes
> needing their argument count both prefixed and postfixed so that we
> can iterate across the call either forward or backward.
> 
> While I'm changing this, I also move quite a lot of tcg-op.h out of
> line.  There is very little benefit to having most of them be inline,
> since their arguments are extracted from the guest instructions being
> translated, and thus their values are not really predictable.
> 
> I chose a cutoff of one function call.  If a tcg-op.h function consists
> of a single function call, inline it, otherwise move it out of line.
> 
> This also removes a bit of boilerplate from each target.
> 
> I haven't been able to measure a performance difference with this
> patch set.  I wouldn't really expect any, as the complexity level
> remains the same.  I simply find the linked list significantly more
> maintainable.
> 
> Of course this isn't intended for the upcoming 2.2 release.
> 
> Comments?

Happy new year! :) Are you going to submit this now?

Paolo

^ permalink raw reply	[flat|nested] 21+ messages in thread
