* [Qemu-devel] [PATCH v2 0/6] Reduce lock contention on TCG hot-path
@ 2016-07-05 16:18 Alex Bennée
From: Alex Bennée @ 2016-07-05 16:18 UTC
  To: mttcg, qemu-devel, fred.konrad, a.rigo, serge.fdrv, cota,
	bobby.prani, rth
  Cc: mark.burton, pbonzini, jan.kiszka, peter.maydell,
	claudio.fontana, Alex Bennée

Hi,

This is the first re-spin of the series posted last week. I've added
several more patches that are more aggressive about avoiding lock
bouncing but, to be honest, the numbers don't seem to make them worth
it.

I think the first 3 patches are ready to be merged if the TCG
maintainers want to take them:

    tcg: Ensure safe tb_jmp_cache lookup out of 'tb_lock'
    tcg: set up tb->page_addr before insertion
    tcg: cpu-exec: remove tb_lock from the hot-path

The remaining patches are included for discussion.

I've re-spun the benchmarks with a larger tarball to show the
difference more clearly:

Baseline Run
============

retry.py called with ['./arm-linux-user/qemu-arm', './pigz.armhf', '-c', '-9', 'linux-4.6.3.tar']
Source code is @ pull-target-arm-20160627-162-ged7e184 or heads/misc/docker-linux-user-v4
run 1: ret=0 (PASS), time=32.786249 (1/1)
run 2: ret=0 (PASS), time=32.535492 (2/2)
run 3: ret=0 (PASS), time=33.036394 (3/3)
run 4: ret=0 (PASS), time=33.036447 (4/4)
run 5: ret=0 (PASS), time=33.036706 (5/5)
run 6: ret=0 (PASS), time=33.536869 (6/6)
run 7: ret=0 (PASS), time=33.286681 (7/7)
run 8: ret=0 (PASS), time=35.292143 (8/8)
run 9: ret=0 (PASS), time=33.286727 (9/9)
run 10: ret=0 (PASS), time=32.786092 (10/10)
Results summary:
0: 10 times (100.00%), avg time 33.262 (0.59 varience/0.77 deviation)

Up to and including tcg: cpu-exec: remove tb_lock from the hot-path
===================================================================

Ran command 10 times, 10 passes
retry.py called with ['./arm-linux-user/qemu-arm', './pigz.armhf', '-c', '-9', 'linux-4.6.3.tar']
Source code is @ pull-target-arm-20160627-165-ga6c4538 or heads/misc/docker-linux-user-v4-3-ga6c4538
run 1: ret=0 (PASS), time=29.783023 (1/1)
run 2: ret=0 (PASS), time=29.532725 (2/2)
run 3: ret=0 (PASS), time=29.783066 (3/3)
run 4: ret=0 (PASS), time=29.783209 (4/4)
run 5: ret=0 (PASS), time=29.783338 (5/5)
run 6: ret=0 (PASS), time=30.033726 (6/6)
run 7: ret=0 (PASS), time=32.039076 (7/7)
run 8: ret=0 (PASS), time=29.783116 (8/8)
run 9: ret=0 (PASS), time=30.033237 (9/9)
run 10: ret=0 (PASS), time=30.283845 (10/10)
Results summary:
0: 10 times (100.00%), avg time 30.084 (0.51 varience/0.72 deviation)

The whole series
================

Ran command 10 times, 10 passes
retry.py called with ['./arm-linux-user/qemu-arm', './pigz.armhf', '-c', '-9', 'linux-4.6.3.tar']
Source code is @ pull-target-arm-20160627-168-ge9609f6 or heads/tcg/hot-path-and-misc-cleanups-v2
run 1: ret=0 (PASS), time=29.532766 (1/1)
run 2: ret=0 (PASS), time=29.534664 (2/2)
run 3: ret=0 (PASS), time=29.533659 (3/3)
run 4: ret=0 (PASS), time=29.282399 (4/4)
run 5: ret=0 (PASS), time=30.283774 (5/5)
run 6: ret=0 (PASS), time=30.033609 (6/6)
run 7: ret=0 (PASS), time=30.283790 (7/7)
run 8: ret=0 (PASS), time=29.783237 (8/8)
run 9: ret=0 (PASS), time=30.033356 (9/9)
run 10: ret=0 (PASS), time=32.536344 (10/10)
Results summary:
0: 10 times (100.00%), avg time 30.084 (0.86 varience/0.93 deviation)
Ran command 10 times, 10 passes

I think the variance and deviation calculations are correct now. The
benchmark is run with my retry script:

    https://github.com/stsquad/retry

The command line was:

    retry.py -l pigz.bench -g -n 10 -c -- ./arm-linux-user/qemu-arm \
        ./pigz.armhf -c -9 linux-4.6.3.tar > /dev/null
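
For anyone who wants to cross-check the summary lines, here is a minimal
sketch that recomputes them from the per-run times listed above. It
assumes the summaries use the sample (n-1) variance and its square root;
that is inferred from the numbers, not taken from the retry.py source.
The dictionary keys are labels made up for illustration.

```python
import statistics

# Wall-clock times in seconds, copied from the three result listings.
runs = {
    "baseline": [32.786249, 32.535492, 33.036394, 33.036447, 33.036706,
                 33.536869, 33.286681, 35.292143, 33.286727, 32.786092],
    "hot-path": [29.783023, 29.532725, 29.783066, 29.783209, 29.783338,
                 30.033726, 32.039076, 29.783116, 30.033237, 30.283845],
    "whole-series": [29.532766, 29.534664, 29.533659, 29.282399, 30.283774,
                     30.033609, 30.283790, 29.783237, 30.033356, 32.536344],
}


def summarise(times):
    """Return (mean, sample variance, sample standard deviation)."""
    return (statistics.mean(times),
            statistics.variance(times),   # divides by n - 1
            statistics.stdev(times))      # square root of the above


summaries = {name: summarise(times) for name, times in runs.items()}

for name, (mean, var, dev) in summaries.items():
    print(f"{name}: avg time {mean:.3f} "
          f"({var:.2f} variance/{dev:.2f} deviation)")

# Relative saving in average wall-clock time compared with the baseline.
base_mean = summaries["baseline"][0]
savings = {name: 100.0 * (base_mean - summaries[name][0]) / base_mean
           for name in ("hot-path", "whole-series")}
for name, pct in savings.items():
    print(f"{name}: {pct:.1f}% less time than baseline")
```

With the times above this should reproduce the reported averages and
deviations, and it puts the saving from the first 3 patches at a little
under 10% with no measurable further gain from the rest of the series.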

Alex Bennée (5):
  tcg: set up tb->page_addr before insertion
  tcg: cpu-exec: remove tb_lock from the hot-path
  tcg: cpu-exec: factor out TB patching code
  tcg: introduce tb_lock_recursive()
  tcg: cpu-exec: roll-up tb_find_fast/slow

Sergey Fedorov (1):
  tcg: Ensure safe tb_jmp_cache lookup out of 'tb_lock'

 cpu-exec.c      | 153 +++++++++++++++++++++++++++++++++-----------------------
 tcg/tcg.h       |   1 +
 translate-all.c |  28 ++++++++---
 3 files changed, 114 insertions(+), 68 deletions(-)

-- 
2.7.4


Thread overview: 41+ messages
2016-07-05 16:18 [Qemu-devel] [PATCH v2 0/6] Reduce lock contention on TCG hot-path Alex Bennée
2016-07-05 16:18 ` [Qemu-devel] [PATCH v2 1/6] tcg: Ensure safe tb_jmp_cache lookup out of 'tb_lock' Alex Bennée
2016-07-07 13:52   ` Sergey Fedorov
2016-07-08 14:51   ` Sergey Fedorov
2016-07-05 16:18 ` [Qemu-devel] [PATCH v2 2/6] tcg: set up tb->page_addr before insertion Alex Bennée
2016-07-07 14:08   ` Sergey Fedorov
2016-07-08  9:40     ` Sergey Fedorov
2016-07-05 16:18 ` [Qemu-devel] [PATCH v2 3/6] tcg: cpu-exec: remove tb_lock from the hot-path Alex Bennée
2016-07-07 14:18   ` Sergey Fedorov
2016-07-08 15:50     ` Sergey Fedorov
2016-07-08 17:34     ` Sergey Fedorov
2016-07-08 18:03       ` Alex Bennée
2016-07-08 18:20         ` Sergey Fedorov
2016-07-08 20:09   ` Sergey Fedorov
2016-07-05 16:18 ` [Qemu-devel] [PATCH v2 4/6] tcg: cpu-exec: factor out TB patching code Alex Bennée
2016-07-05 16:18 ` [Qemu-devel] [PATCH v2 5/6] tcg: introduce tb_lock_recursive() Alex Bennée
2016-07-05 16:18 ` [Qemu-devel] [PATCH v2 6/6] tcg: cpu-exec: roll-up tb_find_fast/slow Alex Bennée
2016-07-07 16:44   ` Sergey Fedorov
2016-07-07 16:44     ` [Qemu-devel] [PATCH 1/3] tcg: Introduce mmap_lock_reset() Sergey Fedorov
2016-07-07 16:44     ` [Qemu-devel] [PATCH 2/3] tcg: Introduce tb_lock_locked() Sergey Fedorov
2016-07-07 16:44     ` [Qemu-devel] [PATCH 3/3] tcg: Avoid bouncing tb_lock between tb_gen_code() and tb_add_jump() Sergey Fedorov
2016-07-07 19:36       ` Alex Bennée
2016-07-07 19:46         ` Sergey Fedorov
2016-07-07 20:36           ` Sergey Fedorov
2016-07-07 21:40             ` Alex Bennée
2016-07-08  8:40       ` Paolo Bonzini
2016-07-08 10:25         ` Sergey Fedorov
2016-07-08 11:02           ` Paolo Bonzini
2016-07-08 12:32             ` Sergey Fedorov
2016-07-08 14:07               ` Paolo Bonzini
2016-07-08 19:55                 ` Sergey Fedorov
2016-07-08 20:18                   ` Paolo Bonzini
2016-07-08 20:24                     ` Sergey Fedorov
2016-07-08 20:52                       ` Paolo Bonzini
2016-07-11 13:06                         ` Sergey Fedorov
2016-07-11 14:03                           ` Paolo Bonzini
2016-07-11 14:27                             ` Sergey Fedorov
2016-07-07 16:04 ` [Qemu-devel] [PATCH v2 0/6] Reduce lock contention on TCG hot-path Emilio G. Cota
2016-07-07 16:13   ` Paolo Bonzini
2016-07-07 19:33     ` Alex Bennée
2016-07-07 19:38   ` Alex Bennée
