From: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
To: linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.orgv,
	virtualization@lists.linux-foundation.org
Cc: benh@kernel.crashing.org, paulus@samba.org, mpe@ellerman.id.au,
	peterz@infradead.org, mingo@redhat.com,
	paulmck@linux.vnet.ibm.com, waiman.long@hpe.com,
	root <root@ltcalpine2-lp13.aus.stglabs.ibm.com>
Subject: [PATCH v5 0/6] powerPC/pSeries use pv-qpsinlock as the default spinlock implemention
Date: Thu,  2 Jun 2016 17:22:43 +0800
Message-ID: <1464859370-5162-1-git-send-email-xinhui.pan@linux.vnet.ibm.com>

From: root <root@ltcalpine2-lp13.aus.stglabs.ibm.com>

Changes from v4:
	Bug fix, thanks to Boqun for reporting this issue.
	struct __qspinlock has a different layout on big-endian machines, so
	native_queued_spin_unlock() may write the value to the wrong address. Now fixed.
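
To illustrate the kind of layout problem described above, here is a minimal user-space sketch. It assumes a plain 32-bit lock word whose value bits 0-7 act as the "locked" byte; the helper names (locked_byte_offset, unlock_locked_byte, unlock_byte_zero) are hypothetical and are not the code from this series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Assumption for this sketch: the lock word is a plain 32-bit value whose
 * lowest 8 value bits are the "locked" byte. This is a stand-in, not the
 * kernel's struct __qspinlock.
 */

/* Byte offset, inside the 32-bit word, of the byte holding value bits 0-7. */
static size_t locked_byte_offset(void)
{
	const uint32_t probe = 1;	/* only value bit 0 is set */
	unsigned char bytes[sizeof(probe)];
	size_t i;

	memcpy(bytes, &probe, sizeof(probe));
	for (i = 0; i < sizeof(probe); i++)
		if (bytes[i] != 0)
			return i;	/* 0 on little-endian, 3 on big-endian */
	return 0;	/* not reached: probe always has one non-zero byte */
}

/* Endian-aware unlock: one byte store to wherever the locked byte lives. */
static void unlock_locked_byte(uint32_t *lock_word)
{
	assert(lock_word != NULL);
	((unsigned char *)lock_word)[locked_byte_offset()] = 0;
}

/*
 * Endian-naive unlock: always stores to byte 0. On big-endian this clears
 * the most significant byte instead and leaves the locked byte set.
 */
static void unlock_byte_zero(uint32_t *lock_word)
{
	assert(lock_word != NULL);
	((unsigned char *)lock_word)[0] = 0;
}
```

Calling unlock_locked_byte() on a word holding 0x12340001 leaves 0x12340000 on either byte order, while unlock_byte_zero() gives that result only on little-endian; on big-endian it clears the top byte and the low "locked" byte stays set.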
	
Changes from v3:
	A big change in [PATCH v4 4/6] pv-qspinlock: powerpc support pv-qspinlock;
	no other patch changed.
	The cover letter title has also changed, as only pSeries may need to use pv-qspinlock, not all of powerpc.

	1) __pv_wait() will not return until *ptr != val, following a tip from Waiman.
	2) Support lock holder searching by storing the cpu number in a hash table (implemented as an array).
	This is because lock stealing hits too often, up to 10%~20% of all successful lock() calls, and to avoid
	vcpu slices bouncing.
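
The two points above can be sketched in user-space C11 as follows. This is only an illustration under stated assumptions: the table size, the direct-mapped slot policy, the yield hook and every name here (holder_record, holder_lookup, wait_while_equal, demo_yield) are hypothetical, and the table ignores concurrency entirely. It is not the implementation in patch 4/6.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define HOLDER_SLOTS 64		/* hypothetical size, must be a power of two */
#define NO_HOLDER (-1)

/* One slot of the array-backed hash table: which cpu holds which lock. */
struct holder_slot {
	const void *lock;	/* NULL while the slot is unused */
	int cpu;
};

static struct holder_slot holder_table[HOLDER_SLOTS];

/* Hash the lock address down to a slot index. */
static size_t holder_hash(const void *lock)
{
	return (size_t)(((uintptr_t)lock >> 2) & (HOLDER_SLOTS - 1));
}

/* Record the holder; a colliding lock simply overwrites the slot. */
static void holder_record(const void *lock, int cpu)
{
	struct holder_slot *slot = &holder_table[holder_hash(lock)];

	assert(cpu >= 0);
	slot->lock = lock;
	slot->cpu = cpu;
}

/* Return the recorded holder cpu, or NO_HOLDER if the slot is not ours. */
static int holder_lookup(const void *lock)
{
	const struct holder_slot *slot = &holder_table[holder_hash(lock)];

	return slot->lock == lock ? slot->cpu : NO_HOLDER;
}

/* Hypothetical hook standing in for "yield my time slice to holder_cpu". */
typedef void (*yield_fn)(int holder_cpu, void *ctx);

/*
 * Wait loop in the spirit of point 1: do not return until *ptr != val.
 * Returns how many times the yield hook was called.
 */
static unsigned long wait_while_equal(_Atomic unsigned char *ptr,
				      unsigned char val, int holder_cpu,
				      yield_fn yield, void *ctx)
{
	unsigned long yields = 0;

	while (atomic_load_explicit(ptr, memory_order_acquire) == val) {
		yield(holder_cpu, ctx);
		yields++;
	}
	return yields;
}

/* Demo hook: pretends the holder ran and cleared the byte passed in ctx. */
static void demo_yield(int holder_cpu, void *ctx)
{
	(void)holder_cpu;
	atomic_store_explicit((_Atomic unsigned char *)ctx, 0,
			      memory_order_release);
}
```

A waiter would look up the holder with holder_lookup(lock) and pass the result to wait_while_equal(); with demo_yield() as the hook and a byte initially equal to val, the loop yields once and then returns because the value has changed.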
	
Changes from v2:
	__spin_yield_cpu() will yield slices to the lpar if the target cpu is running.
	Remove the unnecessary rmb() in __spin_yield/wake_cpu.
	__pv_wait() will check that *ptr == val.
	Some commit message changes.

Changes from v1:
	Split one patch into 6 patches.
	Some minor code changes.

I ran several tests on a pSeries IBM,8408-E8E with 32 cpus and 64GB of memory, running kernel 4.6.
Benchmark results are below.

2 perf tests:
perf bench futex hash
perf bench futex lock-pi

__test____________________spinlock_______________pv-qspinlock_____
| futex hash       |      528572 ops      |      573238 ops      |
| futex lock-pi    |      354 ops         |      352 ops         |

scheduler test:
Test how many loops of schedule() can finish within 10 seconds on all cpus.

__test____________________spinlock_______________pv-qspinlock_____
| schedule() loops |      340890082       |      331730973       |
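
As a rough user-space analogue of this kind of measurement, the sketch below counts how many sched_yield() iterations one thread completes in a fixed time window. It is an assumption-laden illustration, not the test used for the numbers above: that test counts schedule() loops on all cpus, and the names now_ns and count_yield_loops are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sched.h>
#include <stdint.h>
#include <time.h>

/* Monotonic clock in nanoseconds. */
static int64_t now_ns(void)
{
	struct timespec ts;
	int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

	assert(rc == 0);
	(void)rc;	/* keep builds with NDEBUG free of unused warnings */
	return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/*
 * Count how many yield iterations finish before the window expires.
 * sched_yield() only asks the scheduler to run something else, so this
 * approximates, but does not equal, a loop around schedule() in the kernel.
 */
static uint64_t count_yield_loops(int64_t window_ns)
{
	const int64_t deadline = now_ns() + window_ns;
	uint64_t loops = 0;

	while (now_ns() < deadline) {
		sched_yield();
		loops++;
	}
	return loops;
}
```

Calling count_yield_loops(10LL * 1000000000LL) would mirror the 10 second window; running one such loop per cpu and summing the counts would be closer to the "on all cpus" setup.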

kernel compiling test:
Build a default Linux kernel image and measure how long it takes.

__test____________________spinlock_______________pv-qspinlock_____
| compile time     |      22m             |      22m             |

Some notes:
The performance is about as good as the current spinlock's: better in some cases (futex hash, about 8% more ops) and worse in others (schedule() loops, about 3% fewer).
In some other tests (not listed here), we compared the workloads of the two spinlocks with perf record and perf report;
pv-qspinlock is more lightweight than the current spinlock.
This patch series depends on 2 patches:
[patch]powerpc: Implement {cmp}xchg for u8 and u16
[patch]locking/pvqspinlock: Add lock holder CPU argument to pv_wait() from Waiman

Some other patches in Waiman's "locking/pvqspinlock: Fix missed PV wakeup & support PPC" are not applied for now.


Pan Xinhui (6):
  qspinlock: powerpc support qspinlock
  powerpc: pseries/Kconfig: Add qspinlock build config
  powerpc: lib/locks.c: Add cpu yield/wake helper function
  pv-qspinlock: powerpc support pv-qspinlock
  pv-qspinlock: use cmpxchg_release in __pv_queued_spin_unlock
  powerpc: pseries: Add pv-qspinlock build config/make

 arch/powerpc/include/asm/qspinlock.h               |  41 +++++++
 arch/powerpc/include/asm/qspinlock_paravirt.h      |  38 +++++++
 .../powerpc/include/asm/qspinlock_paravirt_types.h |  13 +++
 arch/powerpc/include/asm/spinlock.h                |  31 ++++--
 arch/powerpc/include/asm/spinlock_types.h          |   4 +
 arch/powerpc/kernel/Makefile                       |   1 +
 arch/powerpc/kernel/paravirt.c                     | 121 +++++++++++++++++++++
 arch/powerpc/lib/locks.c                           |  37 +++++++
 arch/powerpc/platforms/pseries/Kconfig             |   9 ++
 arch/powerpc/platforms/pseries/setup.c             |   5 +
 kernel/locking/qspinlock_paravirt.h                |   2 +-
 11 files changed, 289 insertions(+), 13 deletions(-)
 create mode 100644 arch/powerpc/include/asm/qspinlock.h
 create mode 100644 arch/powerpc/include/asm/qspinlock_paravirt.h
 create mode 100644 arch/powerpc/include/asm/qspinlock_paravirt_types.h
 create mode 100644 arch/powerpc/kernel/paravirt.c

-- 
2.4.11
