From: Richard Henderson <rth@twiddle.net>
To: qemu-devel@nongnu.org
Cc: peter.maydell@linaro.org, "Emilio G. Cota" <cota@braap.org>
Subject: [Qemu-devel] [PULL 22/26] target/aarch64: optimize indirect branches
Date: Mon,  5 Jun 2017 09:52:29 -0700
Message-ID: <20170605165233.4135-23-rth@twiddle.net>
In-Reply-To: <20170605165233.4135-1-rth@twiddle.net>

From: "Emilio G. Cota" <cota@braap.org>

Measurements:

[Baseline is the performance before applying this commit and the previous one.]

-                                    NBench, aarch64-softmmu. Host: Intel i7-4790K @ 4.00GHz

 1.7x +-+--------------------------------------------------------------------------------------------------------------+-+
      |                                                                                                                  |
      |   cross                                                                                                          |
 1.6x +cross+jr.................................................####...................................................+-+
      |                                                         #++#                                                     |
      |                                                         #  #                                                     |
 1.5x +-+...................................................*****..#...................................................+-+
      |                                                     *+++*  #                                                     |
      |                                                     *   *  #                                                     |
 1.4x +-+...................................................*...*..#...................................................+-+
      |                                                     *   *  #                                                     |
      |                                     #####           *   *  #                                                     |
 1.3x +-+................................****+++#...........*...*..#...................................................+-+
      |                                  *++*   #           *   *  #                                                     |
      |                                  *  *   #           *   *  #                                                     |
 1.2x +-+................................*..*...#...........*...*..#...................................................+-+
      |                                  *  *   #           *   *  #                                                     |
      |                            ####  *  *   #           *   *  #                                                     |
 1.1x +-+.......................+++#..#..*..*...#...........*...*..#...................................................+-+
      |                         ****  #  *  *   #           *   *  #                                        ****####     |
      |                         *  *  #  *  *   #           *   *  #  ****###   +++####            ****###  *  *   #     |
   1x +-++-++++++-++++****###++-*++*++#++*++*+-+#++****+++++*+++*++#++*++*-+#++*****++#++****###-++*++*-+#++*+-*+++#+-++-+
      |     *****###  *  *  #   *  *  #  *  *   #  *++*###  *   *  #  *  *  #  *   *  #  *  *++#   *  *  #  *  *   #     |
      |     *   *++#  *  *  #   *  *  #  *  *   #  *  *  #  *   *  #  *  *  #  *   *  #  *  *  #   *  *  #  *  *   #     |
 0.9x +-+---*****###--****###---****###--****####--****###--*****###--****###--*****###--****###---****###--****####---+-+
           ASSIGNMENT           FOURIER            HUFFMAN       LU DECOMPOSITION     NUMERIC SORT            hmean
                     BITFIELD          FP EMULATION           IDEA            NEURAL NET         STRING SORT
  png: http://imgur.com/qO9ubtk
NB: "cross" here denotes the previous commit (cross-page direct jumps);
"cross+jr" denotes the previous commit plus this one.

-                            SPECint06 (test set), aarch64-linux-user. Host: Intel i7-4790K @ 4.00GHz

 1.5x +-+--------------------------------------------------------------------------------------------------------------+-+
      |                                                                       *****                                      |
      |                                                                       *+++*                           jr         |
      |                                                                       *   *                                      |
 1.4x +-+.....................................................................*...*.....................+++............+-+
      |                                                                       *   *                      |               |
      |                                      *****                            *   *                      |               |
      |                                      *   *                            *   *                    *****             |
 1.3x +-+....................................*...*............................*...*....................*.|.*...........+-+
      |                       +++            *   *                            *   *                    * | *             |
      |                      *****           *   *                            *   *                    *+++*             |
      |                      *   *           *   *                            *   *                    *   *             |
 1.2x +-+....................*...*...........*...*............................*...*...........*****....*...*...........+-+
      |     *****            *   *           *   *                            *   *           *   *    *   *    +++      |
      |     *   *            *   *           *   *                            *   *           *   *    *   *   *****     |
      |     *   *            *   *   *****   *   *                            *   *           *   *    *   *   *   *     |
 1.1x +-+...*...*............*...*...*...*...*...*............................*...*....+++....*...*....*...*...*...*...+-+
      |     *   *            *   *   *   *   *   *                            *   *   *****   *   *    *   *   *   *     |
      |     *   *            *   *   *   *   *   *   *****                    *   *   *   *   *   *    *   *   *   *     |
      |     *   *   *****    *   *   *   *   *   *   *   *   ******           *   *   *   *   *   *    *   *   *   *     |
   1x +-++-+*+++*-++*+++*++++*+-+*+++*-++*+++*-++*+++*+++*++-*++++*-++*****+++*++-*+++*++-*+++*+-+*++++*+++*++-*+++*+-++-+
      |     *   *   *   *    *   *   *   *   *   *   *   *   *    *   *+++*   *   *   *   *   *   *    *   *   *   *     |
      |     *   *   *   *    *   *   *   *   *   *   *   *   *    *   *   *   *   *   *   *   *   *    *   *   *   *     |
      |     *   *   *   *    *   *   *   *   *   *   *   *   *    *   *   *   *   *   *   *   *   *    *   *   *   *     |
 0.9x +-+---*****---*****----*****---*****---*****---*****---******---*****---*****---*****---*****----*****---*****---+-+
            astar   bzip2     gcc    gobmk  h264ref  hmmer libquantum  mcf  omnetpp perlbench sjeng  xalancbmk hmean
  png: http://imgur.com/3Dp4vvq

-                           SPECint06 (train set), aarch64-linux-user. Host: Intel i7-4790K @ 4.00GHz

 1.7x +-+--------------------------------------------------------------------------------------------------------------+-+
      |                                                                                                                  |
      |                                                                                                       jr         |
 1.6x +-+...............................................................................................+++............+-+
      |                                                                                                *****             |
      |                                                                                                *+++*             |
      |                                                                                                *   *             |
 1.5x +-+..............................................................................................*...*...........+-+
      |                                                                        +++                     *   *             |
      |                                                                       *****                    *   *             |
 1.4x +-+.....................................................................*+++*....................*...*...........+-+
      |                                                                       *   *                    *   *             |
      |                                      *****                            *   *                    *   *             |
      |                                      *   *                            *   *   *****            *   *             |
 1.3x +-+....................................*...*............................*...*...*...*............*...*...........+-+
      |                       +++            *   *                            *   *   *   *            *   *             |
      |                      *****           *   *                            *   *   *   *   *****    *   *             |
 1.2x +-+....................*...*...........*...*............................*...*...*...*...*+++*....*...*...*****...+-+
      |                      *   *           *   *                            *   *   *   *   *   *    *   *   *+++*     |
      |     *****            *   *   *****   *   *                            *   *   *   *   *   *    *   *   *   *     |
      |     *   *            *   *   *+++*   *   *                            *   *   *   *   *   *    *   *   *   *     |
 1.1x +-+...*...*............*...*...*...*...*...*............................*...*...*...*...*...*....*...*...*...*...+-+
      |     *   *   *****    *   *   *   *   *   *                    *****   *   *   *   *   *   *    *   *   *   *     |
      |     *   *   *   *    *   *   *   *   *   *    +++    ******   *+++*   *   *   *   *   *   *    *   *   *   *     |
   1x +-+---*****---*****----*****---*****---*****---*****---******---*****---*****---*****---*****----*****---*****---+-+
            astar   bzip2     gcc    gobmk  h264ref  hmmer libquantum  mcf  omnetpp perlbench sjeng  xalancbmk hmean
  png: http://imgur.com/vRrdc9j

Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
---
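The one-line change below replaces an unconditional exit to the main
loop with a lookup of the next TB by guest PC, followed by a direct
jump to its host code when the lookup hits. A minimal sketch of that
idea follows, as a toy model only: every name in it (ToyTB, ToyCPU,
toy_hash_pc, toy_cache_insert, toy_lookup_and_goto_ptr, toy_epilogue)
is hypothetical and is not QEMU's API. It also simplifies by matching
on the PC alone and by using a made-up hash and cache size, which may
not reflect what the real lookup does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model; none of these names are QEMU's. */
#define JMP_CACHE_BITS 8
#define JMP_CACHE_SIZE (1u << JMP_CACHE_BITS)

typedef struct ToyTB {
    uint64_t pc;        /* guest PC this block was translated from */
    const void *code;   /* host code for the block (opaque here) */
} ToyTB;

typedef struct ToyCPU {
    ToyTB *jmp_cache[JMP_CACHE_SIZE];  /* direct-mapped, indexed by hashed PC */
} ToyCPU;

/* Stand-in for the "return to the main loop" path taken on a miss. */
static const char toy_epilogue[] = "epilogue";

static unsigned toy_hash_pc(uint64_t pc)
{
    /* AArch64 instructions are 4-byte aligned, so drop the low two bits. */
    return (unsigned)(pc >> 2) & (JMP_CACHE_SIZE - 1);
}

/* Record a translated block so later indirect branches can find it. */
static void toy_cache_insert(ToyCPU *cpu, ToyTB *tb)
{
    assert(tb != NULL && tb->code != NULL);
    cpu->jmp_cache[toy_hash_pc(tb->pc)] = tb;
}

/* Return the host code to jump to for guest `pc`, or the epilogue on a miss. */
static const void *toy_lookup_and_goto_ptr(const ToyCPU *cpu, uint64_t pc)
{
    const ToyTB *tb = cpu->jmp_cache[toy_hash_pc(pc)];

    if (tb != NULL && tb->pc == pc) {
        return tb->code;    /* hit: chain straight to the next block */
    }
    return toy_epilogue;    /* miss or stale entry: take the slow path */
}
```

In this model the old behaviour corresponds to always returning
toy_epilogue. The speedups above would then come from indirect
branches whose targets are already cached and so skip the trip
through the main loop.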
 target/arm/translate-a64.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index ab61d96..860e279 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -11367,8 +11367,7 @@ void gen_intermediate_code_a64(ARMCPU *cpu, TranslationBlock *tb)
             gen_a64_set_pc_im(dc->pc);
             /* fall through */
         case DISAS_JUMP:
-            /* indicate that the hash table must be used to find the next TB */
-            tcg_gen_exit_tb(0);
+            tcg_gen_lookup_and_goto_ptr(cpu_pc);
             break;
         case DISAS_TB_JUMP:
         case DISAS_EXC:
-- 
2.9.4
