[Qemu-devel] [PATCH 0/2] Reduce lock contention on TCG hot-path

* [Qemu-devel] [PATCH 0/2] Reduce lock contention on TCG hot-path
@ 2016-07-01 16:16 Alex Bennée
  2016-07-01 16:16 ` [Qemu-devel] [PATCH 1/2] tcg: Ensure safe tb_jmp_cache lookup out of 'tb_lock' Alex Bennée
                   ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Alex Bennée @ 2016-07-01 16:16 UTC (permalink / raw)
  To: mttcg, qemu-devel, fred.konrad, a.rigo, serge.fdrv, cota,
	bobby.prani, rth
  Cc: mark.burton, pbonzini, jan.kiszka, peter.maydell,
	claudio.fontana, Alex Bennée

These patches have been on the list before in my base enabling patches
series [1]. However while looking at some user-space work loads I realised
there is no particular reason to hold them back until the MTTCG work
is complete. I fixed one missing atomic_set in Sergey's patch and
addressed his review comments from the last posting.

For a simple parallel user-mode test it give a ~5% speed boost:

Before:

retry.oy called with ['./arm-linux-user/qemu-arm', './pigz.armhf', '-c', '-9', 'source.tar']
Source code is @ pull-target-arm-20160627-153-g1b756f1 or heads/master
run 1: ret=0 (PASS), time=4.755824 (1/1)
run 2: ret=0 (PASS), time=4.756076 (2/2)
run 3: ret=0 (PASS), time=4.755916 (3/3)
run 4: ret=0 (PASS), time=4.755853 (4/4)
run 5: ret=0 (PASS), time=4.755929 (5/5)
Results summary:
0: 5 times (100.00%), avg time 4.755920 (0.000000 deviation)

After:

retry.py called with ['./arm-linux-user/qemu-arm', './pigz.armhf', '-c', '-9', 'source.tar']
Source code is @ pull-target-arm-20160627-155-g579ffd4 or heads/tcg/hot-path-cleanups
run 1: ret=0 (PASS), time=4.505735 (1/1)
run 2: ret=0 (PASS), time=4.505683 (2/2)
run 3: ret=0 (PASS), time=4.505666 (3/3)
run 4: ret=0 (PASS), time=4.505578 (4/4)
run 5: ret=0 (PASS), time=4.505544 (5/5)
Results summary:
0: 5 times (100.00%), avg time 4.505641 (0.000000 deviation)

For system-mode the change is in the noise despite the fact by
dropping the CONFIG_USER_ONLY specific stuff we will run
tb_find_physical twice on the first lookup for any given block.

Before:

retry.py called with ['/home/alex/lsrc/qemu/qemu.git/arm-softmmu/qemu-system-arm', '-machine', 'type=virt', '-display', 'none', '-smp', '1', '-m', '4096', '-cpu', 'cortex-a15', '-serial', 'telnet:127.0.0.1:4444', '-monitor', 'stdio', '-netdev', 'user,id=unet,hostfwd=tcp::2222-:22', '-device', 'virtio-net-device,netdev=unet', '-drive', 'file=/home/alex/lsrc/qemu/images/jessie-arm32.qcow2,id=myblock,index=0,if=none', '-device', 'virtio-blk-device,drive=myblock', '-append', 'console=ttyAMA0 systemd.unit=benchmark.service root=/dev/vda1', '-kernel', '/home/alex/lsrc/qemu/images/aarch32-current-linux-kernel-only.img']
Source code is @ pull-target-arm-20160627-153-g1b756f1 or heads/master
run 1: ret=0 (PASS), time=10.262175 (1/1)
run 2: ret=0 (PASS), time=10.262821 (2/2)
run 3: ret=0 (PASS), time=9.762559 (3/3)
run 4: ret=0 (PASS), time=9.762108 (4/4)
run 5: ret=0 (PASS), time=10.262576 (5/5)
Results summary:
0: 5 times (100.00%), avg time 10.062448 (0.060046 deviation)
Ran command 5 times, 5 passes

After:

retry.py called with ['/home/alex/lsrc/qemu/qemu.git/arm-softmmu/qemu-system-arm', '-machine', 'type=virt', '-display', 'none', '-smp', '1', '-m', '4096', '-cpu', 'cortex-a15', '-serial', 'telnet:127.0.0.1:4444', '-monitor', 'stdio', '-netdev', 'user,id=unet,hostfwd=tcp::2222-:22', '-device', 'virtio-net-device,netdev=unet', '-drive', 'file=/home/alex/lsrc/qemu/images/jessie-arm32.qcow2,id=myblock,index=0,if=none', '-device', 'virtio-blk-device,drive=myblock', '-append', 'console=ttyAMA0 systemd.unit=benchmark.service root=/dev/vda1', '-kernel', '/home/alex/lsrc/qemu/images/aarch32-current-linux-kernel-only.img']
Source code is @ pull-target-arm-20160627-155-g579ffd4 or heads/tcg/hot-path-cleanups
run 1: ret=0 (PASS), time=9.761559 (1/1)
run 2: ret=0 (PASS), time=9.511616 (2/2)
run 3: ret=0 (PASS), time=9.761713 (3/3)
run 4: ret=0 (PASS), time=10.262504 (4/4)
run 5: ret=0 (PASS), time=9.762059 (5/5)
Results summary:
0: 5 times (100.00%), avg time 9.811890 (0.060150 deviation)
Ran command 5 times, 5 passes

[1] https://www.mail-archive.com/qemu-devel@nongnu.org/msg375023.html

Alex Bennée (1):
  cpu-exec: remove tb_lock from the hot-path

Sergey Fedorov (1):
  tcg: Ensure safe tb_jmp_cache lookup out of 'tb_lock'

 cpu-exec.c      | 58 ++++++++++++++++++++++++++++-----------------------------
 translate-all.c |  7 ++++++-
 2 files changed, 35 insertions(+), 30 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 24+ messages in thread