* [Qemu-devel] [PATCH 0/7] target-mips: support MTTCG feature
@ 2018-01-19 15:56 Aleksandar Markovic
From: Aleksandar Markovic @ 2018-01-19 15:56 UTC
  To: qemu-devel
  Cc: Alex Bennée, Aurelien Jarno, Fam Zheng, Gerd Hoffmann,
	Laurent Vivier, Paolo Bonzini, Peter Maydell,
	Philippe Mathieu-Daudé,
	Richard Henderson, Riku Voipio, Yongbok Kim, Aleksandar Markovic,
	Goran Ferenc, Miodrag Dinic, Petar Jovanovic

From: Aleksandar Markovic <aleksandar.markovic@mips.com>

This series introduces the MTTCG feature for MIPS targets by adding the
missing pieces and formally enabling the corresponding QEMU builds to
support such configurations.

PATCH ORGANIZATION
==================

The organization of patches is as follows:

  - patches 1 and 2 improve the emulation of MIPS LL/SC instructions
    for MTTCG. They are based on a patch series previously sent by
    Leon Alrae (the link is to its last version, v3):
    http://lists.gnu.org/archive/html/qemu-devel/2016-09/msg06870.html

  - patches 3, 4, 5, and 6 deal with locking/synchronization issues
    that surfaced while introducing MTTCG for MIPS. Similar sets of
    patches have already been integrated for some other platforms
    (arm, intel, ppc, sparc).

  - patch 7 enables the QEMU build system to support the MTTCG feature
    for MIPS targets.
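
For readers unfamiliar with the idea behind patches 1 and 2, here is a
minimal sketch of how an LL/SC pair can be emulated with a
compare-and-swap. It is not the code from this series: the LinkState
type and the emulate_ll/emulate_sc/ll_sc_fetch_add names are
hypothetical, and C11 atomics stand in for QEMU's own helpers.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-vCPU link state; names are illustrative, not QEMU's. */
typedef struct {
    _Atomic uint32_t *lladdr;  /* address recorded by the last LL, NULL if none */
    uint32_t llval;            /* value observed by the last LL */
} LinkState;

/* LL: load the word and remember where it came from and what it held. */
static uint32_t emulate_ll(LinkState *ls, _Atomic uint32_t *addr)
{
    ls->lladdr = addr;
    ls->llval = atomic_load(addr);
    return ls->llval;
}

/*
 * SC: succeed only if the address matches the one recorded by LL and the
 * word still holds the value LL saw. The compare-and-swap performs the
 * value check and the store as a single atomic step.
 */
static bool emulate_sc(LinkState *ls, _Atomic uint32_t *addr, uint32_t newval)
{
    bool ok = false;

    if (ls->lladdr == addr) {
        uint32_t expected = ls->llval;
        ok = atomic_compare_exchange_strong(addr, &expected, newval);
    }
    ls->lladdr = NULL;  /* the link is consumed whether or not SC succeeded */
    return ok;
}

/* Atomic add written the way a guest would: an LL/SC retry loop. */
static uint32_t ll_sc_fetch_add(LinkState *ls, _Atomic uint32_t *addr,
                                uint32_t inc)
{
    uint32_t old;

    do {
        old = emulate_ll(ls, addr);
    } while (!emulate_sc(ls, addr, old + inc));
    return old;
}
```

Note that a value comparison cannot detect a word that was changed and
then changed back between LL and SC (the ABA case), which is a known
limitation of cmpxchg-based LL/SC emulation in general.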

PERFORMANCE TESTING
===================

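Performance was measured with the atomic_add-bench test program, which
exercises LL/SC-related functionality in a multithreaded environment.
MTTCG throughput was higher in every measured configuration: about 8.6
times higher with one thread, and hundreds of times higher with 64
threads at ranges 128 and 1024 (see the numerical data below).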

For ease of comparison, the test cases are organized like those in a
previously sent patch set:

target-arm: emulate aarch64's LL/SC using cmpxchg helpers
https://lists.gnu.org/archive/html/qemu-devel/2016-10/msg06653.html
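
As a rough illustration of the kind of workload measured here, the
sketch below shows a per-thread loop that performs atomic increments on
counters picked at random from a small array. It is not the
atomic_add-bench source; the worker and total functions and their
parameters are assumptions made for this example. The sketch assumes
that the range determines how many distinct counters the threads share,
so a smaller range means more contention on each counter.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Small xorshift PRNG so each thread can pick counters without shared state. */
static uint32_t xorshift32(uint32_t *state)
{
    uint32_t x = *state;

    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/*
 * Hypothetical per-thread loop: perform n_ops atomic increments, each on
 * a counter picked at random from counters[0 .. n_counters - 1].
 */
static void worker(_Atomic uint64_t *counters, size_t n_counters,
                   uint64_t n_ops, uint32_t seed)
{
    uint32_t rng = seed ? seed : 1;  /* xorshift state must be non-zero */

    for (uint64_t i = 0; i < n_ops; i++) {
        size_t idx = xorshift32(&rng) % n_counters;
        atomic_fetch_add(&counters[idx], 1);
    }
}

/* Sum of all counters; equals the number of increments performed so far. */
static uint64_t total(_Atomic uint64_t *counters, size_t n_counters)
{
    uint64_t sum = 0;

    for (size_t i = 0; i < n_counters; i++) {
        sum += atomic_load(&counters[i]);
    }
    return sum;
}
```

In a real run, one such loop would execute in each guest thread. On a
MIPS guest without a native atomic add instruction, each increment
compiles to an LL/SC retry loop, which is why this workload stresses
LL/SC emulation.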

-----------------------------------------------------------------------

          atomic_add-bench: 1000000 ops/thread, [0,1] range
                                               
throughput                                  M - MTTCG     N - no MTTCG

 50 +---------+---------+---------+---------+---------+---------+----+
    |                                                                |
    |M                                                               |
 40 +.                                                               +
    |.                                                               |
    |.                                                               |
 30 +.                                                               +
    |.                                                               |
    |.                                                               |
 20 +.                                                               +
    | M                                                              |
    | .                                                              |
 10 +  .M...M.......M.......M.......M.......M.......M.......M.......M+
    |N                                                               |
    | N.N...N.......N.......N.......N.......N.......N.......N.......N|
  0 +---------+---------+---------+---------+---------+---------+----+
    0         10        20        30        40        50        60

                            number of threads

-----------------------------------------------------------------------

          atomic_add-bench: 1000000 ops/thread, [0,2] range
                                               
throughput                                  M - MTTCG     N - no MTTCG

 50 +---------+---------+---------+---------+---------+---------+----+
    |                                                                |
    |M                                                               |
 40 +.                                                               +
    |.                                                               |
    |.                                                               |
 30 + .                                                              +
    | M                                                              |
    | .                                                              |
 20 +  .M...M.......M.......M.......M.......M.......M.......M.......M+
    |                                                                |
    |                                                                |
 10 +                                                                +
    |N                                                               |
    | N.N...N.......N.......N.......N.......N.......N.......N.......N|
  0 +---------+---------+---------+---------+---------+---------+----+
    0         10        20        30        40        50        60

                            number of threads

-----------------------------------------------------------------------

          atomic_add-bench: 1000000 ops/thread, [0,128] range
                                               
throughput                                  M - MTTCG     N - no MTTCG

150 +---------+---------+---------+---------+---------+---------+----+
    |                                                                |
    |                                            ...M...        ....M|
120 +                   ....M.......M........M...       ....M...     +
    |           ....M...                                             |
    |     ..M...                                                     |
 90 +    .                                                           +
    |  .M                                                            |
    | .                                                              |
 60 + M                                                              +
    |.                                                               |
    |M                                                               |
 30 +                                                                +
    |                                                                |
    |NN.N...N.......N.......N.......N.......N.......N.......N.......N|
  0 +---------+---------+---------+---------+---------+---------+----+
    0         10        20        30        40        50        60

                            number of threads

-----------------------------------------------------------------------

          atomic_add-bench: 1000000 ops/thread, [0,1024] range
                                               
throughput                                  M - MTTCG     N - no MTTCG

150 +---------+---------+---------+---------+---------+---------+----+
    |                                            ...M.......M.......M|
    |                           ....M...       ..                    |
120 +           ....M.......M...        ....M..                      +
    |     ..M...                                                     |
    |   M.                                                           |
 90 +  .                                                             +
    | .                                                              |
    | .                                                              |
 60 + M                                                              +
    |.                                                               |
    |M                                                               |
 30 +                                                                +
    |                                                                |
    |NN.N...N.......N.......N.......N.......N.......N.......N.......N|
  0 +---------+---------+---------+---------+---------+---------+----+
    0         10        20        30        40        50        60

                            number of threads

-----------------------------------------------------------------------

Numerical data:

Ops range-->  1               2              128            1024

# of     no              no              no              no
 thr.    MTTCG  MTTCG    MTTCG  MTTCG    MTTCG  MTTCG    MTTCG  MTTCG

  1      4.95   42.61    4.94   42.27    4.89   42.24    4.85   41.81
  2      1.23   18.41    1.29   25.71    1.33   57.41    1.36   60.34
  4      0.46   11.99    0.48   19.69    0.53   78.98    0.50   95.39
  8      0.18    9.59    0.18   19.11    0.19  104.66    0.20  112.66
 16      0.11   11.19    0.12   19.12    0.12  108.29    0.13  121.90
 24      0.10   10.18    0.09   19.14    0.11  115.53    0.10  127.40
 32      0.11   11.15    0.12   19.36    0.09  120.60    0.10  131.60
 40      0.08   10.47    0.11   20.88    0.12  124.59    0.10  124.74
 48      0.12   11.78    0.13   20.09    0.11  129.24    0.11  137.19
 56      0.14   12.40    0.13   22.13    0.15  124.16    0.15  138.52
 64      0.14   11.08    0.20   21.08    0.18  131.28    0.19  144.84
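
To put the table in relative terms, the ratio of MTTCG to non-MTTCG
throughput can be computed per row. A minimal sketch, with a few rows
copied from the table above; the Sample type and the function names are
illustrative only.

```c
#include <stddef.h>
#include <stdio.h>

/* One table row for a single range: throughput without and with MTTCG. */
typedef struct {
    int threads;
    double no_mttcg;
    double mttcg;
} Sample;

/* Rows copied from the numerical data table (ranges 1 and 1024). */
static const Sample range_1[] = {
    { 1, 4.95, 42.61 }, { 8, 0.18, 9.59 }, { 64, 0.14, 11.08 },
};
static const Sample range_1024[] = {
    { 1, 4.85, 41.81 }, { 8, 0.20, 112.66 }, { 64, 0.19, 144.84 },
};

/* How many times higher the MTTCG throughput is for one row. */
static double speedup(const Sample *s)
{
    return s->mttcg / s->no_mttcg;
}

/* Print one line per row, e.g. for a quick look at the scaling trend. */
static void print_speedups(const char *label, const Sample *rows, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        printf("%s, %2d threads: %.1fx\n",
               label, rows[i].threads, speedup(&rows[i]));
    }
}
```

For range 1 the ratio goes from roughly 8.6 with one thread to roughly
79 with 64 threads; for range 1024 it exceeds 700 with 64 threads. Most
of that growth comes from non-MTTCG throughput dropping as threads are
added.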

-----------------------------------------------------------------------

Graphical representation:

 https://i.imgur.com/OtNLpVX.png

-----------------------------------------------------------------------

REGRESSION TESTING
==================

Regression testing was also performed. The main test bed was the LTP
test suite, executed on a QEMU-emulated Debian mips64 system.

Some LTP tests (getrusage04, copy_file_range01) that fail on non-MTTCG
systems pass on MTTCG-enabled systems. Some other LTP tests
(nanosleep01, poll02, pselect01) fail intermittently on both non-MTTCG
and MTTCG configurations, and are therefore not regressions introduced
by this series.

The emulation itself did not show any problems while executing the LTP
test suite.

QEMU user mode MTTCG-enabled emulation was also tested to some extent.

Aleksandar Markovic (2):
  Revert "target/mips: hold BQL for timer interrupts"
  target/mips: introduce MTTCG-enabled builds

Goran Ferenc (1):
  target/mips: hold BQL in mips_vpe_wake()

Leon Alrae (2):
  target/mips: compare virtual addresses in LL/SC sequence
  target/mips: reimplement SC instruction and use cmpxchg

Miodrag Dinic (2):
  hw/mips_int: hold BQL for all interrupt requests
  hw/mips_cpc: kick a VP when putting it into Run state

 configure               |   3 ++
 hw/mips/mips_int.c      |  12 +++++
 hw/misc/mips_cpc.c      |  17 ++++++-
 linux-user/main.c       |  58 ------------------------
 target/mips/cpu.h       |   9 ++--
 target/mips/helper.c    |   6 +--
 target/mips/helper.h    |   2 -
 target/mips/machine.c   |   7 +--
 target/mips/op_helper.c |  74 +++++++++---------------------
 target/mips/translate.c | 118 ++++++++++++++++--------------------------------
 10 files changed, 100 insertions(+), 206 deletions(-)

-- 
2.7.4
