From: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
To: linux-kernel@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org, benh@kernel.crashing.org,
	paulus@samba.org, mpe@ellerman.id.au, peterz@infradead.org,
	mingo@redhat.com, paulmck@linux.vnet.ibm.com,
	waiman.long@hpe.com, xinhui.pan@linux.vnet.ibm.com,
	virtualization@lists.linux-foundation.org, boqun.feng@gmail.com
Subject: [PATCH v8 0/6] Implement qspinlock/pv-qspinlock on ppc
Date: Mon,  5 Dec 2016 10:19:20 -0500
Message-ID: <1480951166-44830-1-git-send-email-xinhui.pan@linux.vnet.ibm.com>

Hi All,
  This is the fair lock (qspinlock) patchset for powerpc. The patches apply
and build cleanly, and are based on linux-next.
  qspinlock avoids waiter starvation. It performs about the same in the
single-threaded case, and it can be much faster under high contention,
especially when the spinlock is embedded within the data structure it
protects.
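
As a rough illustration of why a queued lock avoids waiter starvation, here is a toy Python sketch. It is not kernel code and not how these patches implement qspinlock; FifoLock, queued() and demo() are made-up names. Each waiter joins the tail of a queue and blocks on its own event, and release hands the lock to the oldest waiter, so nobody can be overtaken indefinitely.

```python
import threading
import time
from collections import deque


class FifoLock:
    """Toy FIFO lock: waiters are served strictly in arrival order."""

    def __init__(self):
        self._guard = threading.Lock()  # protects _held and _waiters
        self._held = False
        self._waiters = deque()         # one Event per queued waiter

    def acquire(self):
        with self._guard:
            if not self._held:
                self._held = True       # uncontended fast path
                return
            me = threading.Event()
            self._waiters.append(me)    # join the tail of the queue
        me.wait()                       # block only on our own event

    def release(self):
        with self._guard:
            if self._waiters:
                # Hand the lock straight to the oldest waiter; _held stays True.
                self._waiters.popleft().set()
            else:
                self._held = False

    def queued(self):
        with self._guard:
            return len(self._waiters)


def demo(n=4):
    """Queue n workers behind a held lock and return the order they got it."""
    lock = FifoLock()
    order = []
    lock.acquire()                      # hold the lock so every worker must queue

    def worker(i):
        lock.acquire()
        order.append(i)                 # safe: only the lock holder appends
        lock.release()

    threads = []
    for i in range(n):
        t = threading.Thread(target=worker, args=(i,))
        t.start()
        while lock.queued() < i + 1:    # wait until worker i is in the queue
            time.sleep(0.001)
        threads.append(t)

    lock.release()
    for t in threads:
        t.join()
    return order


if __name__ == "__main__":
    print(demo())
```

Running it prints [0, 1, 2, 3]: the workers get the lock in the order they queued up. A plain test-and-set lock gives no such guarantee.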

v7 -> v8:
	Add one patch that drops a function call in the native qspinlock unlock path.
	Whether qspinlock is enabled is now a compile-time option.
	Rebase onto linux-next (4.9-rc7).
v6 -> v7:
	Rebase onto 4.8-rc4.
v1 -> v6:
	Too many details; snipped.

Some benchmark results are below.

perf bench
These numbers are operations per second, so higher is better.
*******************************************
on pSeries with 32 vcpus, 32GB memory, pHyp.
------------------------------------------------------------------------------------
test case        | pv-qspinlock | qspinlock | current-spinlock
------------------------------------------------------------------------------------
futex hash       |       618572 |    552332 |           553788
futex lock-pi    |          364 |       364 |              364
sched pipe       |        78984 |     76060 |            81454
------------------------------------------------------------------------------------
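
To read the perf bench numbers as relative differences, a small Python sketch like the following can help. It is not part of the patchset; percent_change and results are illustrative names, and the figures are simply copied from the table above.

```python
def percent_change(new, baseline):
    """Relative difference of `new` versus `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


# (pv-qspinlock, qspinlock, current-spinlock) in ops/sec, from the table above
results = {
    "futex hash": (618572, 552332, 553788),
    "futex lock-pi": (364, 364, 364),
    "sched pipe": (78984, 76060, 81454),
}

for name, (pv, q, cur) in results.items():
    print(f"{name:14s} pv-qspinlock {percent_change(pv, cur):+6.1f}%  "
          f"qspinlock {percent_change(q, cur):+6.1f}%")
```

With current-spinlock as the baseline, pv-qspinlock comes out roughly 11.7% ahead on futex hash and roughly 3% behind on sched pipe.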

unix bench:
These numbers are scores, so higher is better.
************************************************
on PowerNV with 16 cores (cpus), smt off, 32GB memory:
-------------
pv-qspinlock and qspinlock give very similar results because pv-qspinlock uses
the native version, which adds only the overhead of one callback.
------------------------------------------------------------------------------------
test case                 pv-qspinlock and qspinlock  current-spinlock
------------------------------------------------------------------------------------
Execl Throughput                               761.1             761.4
File Copy 1024 bufsize 2000 maxblocks         1259.8            1286.6
File Copy 256 bufsize 500 maxblocks            782.2             790.3
File Copy 4096 bufsize 8000 maxblocks         2741.5            2817.4
Pipe Throughput                               1063.2            1036.7
Pipe-based Context Switching                   284.7             281.1
Process Creation                               679.6             649.1
Shell Scripts (1 concurrent)                  1933.2            1922.9
Shell Scripts (8 concurrent)                  5003.3            4899.8
System Call Overhead                           900.6             896.8
                                             ==========================
System Benchmarks Index Score                 1139.3            1133.0
------------------------------------------------------------------------------------
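
The "System Benchmarks Index Score" row appears to be the geometric mean of the per-test scores above it. The Python sketch below checks that assumption against the "pv-qspinlock and qspinlock" column; it is not part of the patchset, and geometric_mean and scores are illustrative names.

```python
import math

# Scores copied from the PowerNV "pv-qspinlock and qspinlock" column above.
scores = [761.1, 1259.8, 782.2, 2741.5, 1063.2,
          284.7, 679.6, 1933.2, 5003.3, 900.6]


def geometric_mean(values):
    """Return the geometric mean of a sequence of positive numbers."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


index = geometric_mean(scores)
print(f"{index:.1f}")
```

The result lands very close to the reported 1139.3. The other columns can be checked the same way.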

*******************************************
on pSeries with 32 vcpus, 32GB memory, pHyp.
------------------------------------------------------------------------------------
test case                             pv-qspinlock     qspinlock  current-spinlock
------------------------------------------------------------------------------------
Execl Throughput                             877.1         891.2         872.8
File Copy 1024 bufsize 2000 maxblocks       1390.4        1399.2        1395.0
File Copy 256 bufsize 500 maxblocks          882.4         889.5         881.8
File Copy 4096 bufsize 8000 maxblocks       3112.3        3113.4        3121.7
Pipe Throughput                             1095.8        1162.6        1158.5
Pipe-based Context Switching                 194.9         192.7         200.7
Process Creation                             518.4         526.4         509.1
Shell Scripts (1 concurrent)                1401.9        1413.9        1402.2
Shell Scripts (8 concurrent)                3215.6        3246.6        3229.1
System Call Overhead                         833.2         892.4         888.1
                                          ====================================
System Benchmarks Index Score               1033.7        1052.5        1047.8
------------------------------------------------------------------------------------

******************************************
on pSeries with 32 vcpus, 16GB memory, KVM.
------------------------------------------------------------------------------------
test case                             pv-qspinlock    qspinlock  current-spinlock
------------------------------------------------------------------------------------
Execl Throughput                             497.4        518.7         497.8
File Copy 1024 bufsize 2000 maxblocks       1368.8       1390.1        1343.3
File Copy 256 bufsize 500 maxblocks          857.7        859.8         831.4
File Copy 4096 bufsize 8000 maxblocks       2851.7       2838.1        2785.5
Pipe Throughput                             1221.9       1265.3        1250.4
Pipe-based Context Switching                 529.8        578.1         564.2
Process Creation                             408.4        421.6         287.6
Shell Scripts (1 concurrent)                1201.8       1215.3        1185.8
Shell Scripts (8 concurrent)                3758.4       3799.3        3878.9
System Call Overhead                        1008.3       1122.6        1134.2
                                          =====================================
System Benchmarks Index Score               1072.0       1108.9        1050.6
------------------------------------------------------------------------------------

Pan Xinhui (6):
  powerpc/qspinlock: powerpc support qspinlock
  powerpc: pSeries/Kconfig: Add qspinlock build config
  powerpc: lib/locks.c: Add cpu yield/wake helper function
  powerpc/pv-qspinlock: powerpc support pv-qspinlock
  powerpc: pSeries: Add pv-qspinlock build config/make
  powerpc/pv-qspinlock: Optimise native unlock path

 arch/powerpc/include/asm/qspinlock.h               |  93 ++++++++++++
 arch/powerpc/include/asm/qspinlock_paravirt.h      |  52 +++++++
 .../powerpc/include/asm/qspinlock_paravirt_types.h |  13 ++
 arch/powerpc/include/asm/spinlock.h                |  35 +++--
 arch/powerpc/include/asm/spinlock_types.h          |   4 +
 arch/powerpc/kernel/Makefile                       |   1 +
 arch/powerpc/kernel/paravirt.c                     | 157 +++++++++++++++++++++
 arch/powerpc/lib/locks.c                           | 122 ++++++++++++++++
 arch/powerpc/platforms/pseries/Kconfig             |  16 +++
 arch/powerpc/platforms/pseries/setup.c             |   5 +
 10 files changed, 485 insertions(+), 13 deletions(-)
 create mode 100644 arch/powerpc/include/asm/qspinlock.h
 create mode 100644 arch/powerpc/include/asm/qspinlock_paravirt.h
 create mode 100644 arch/powerpc/include/asm/qspinlock_paravirt_types.h
 create mode 100644 arch/powerpc/kernel/paravirt.c

-- 
2.4.11
