linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Nicholas Piggin <npiggin@gmail.com>
To: linuxppc-dev@lists.ozlabs.org, Waiman Long <longman@redhat.com>
Cc: Anton Blanchard <anton@ozlabs.org>,
	Boqun Feng <boqun.feng@gmail.com>,
	kvm-ppc@vger.kernel.org, linux-arch@vger.kernel.org,
	linux-kernel@vger.kernel.org, Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	virtualization@lists.linux-foundation.org,
	Will Deacon <will@kernel.org>
Subject: Re: [PATCH v3 0/6] powerpc: queued spinlocks and rwlocks
Date: Tue, 07 Jul 2020 15:57:06 +1000	[thread overview]
Message-ID: <1594101082.hfq9x5yact.astroid@bobo.none> (raw)
In-Reply-To: <24f75d2c-60cd-2766-4aab-1a3b1c80646e@redhat.com>

Excerpts from Waiman Long's message of July 7, 2020 4:39 am:
> On 7/6/20 12:35 AM, Nicholas Piggin wrote:
>> v3 is updated to use __pv_queued_spin_unlock, noticed by Waiman (thank you).
>>
>> Thanks,
>> Nick
>>
>> Nicholas Piggin (6):
>>    powerpc/powernv: must include hvcall.h to get PAPR defines
>>    powerpc/pseries: move some PAPR paravirt functions to their own file
>>    powerpc: move spinlock implementation to simple_spinlock
>>    powerpc/64s: implement queued spinlocks and rwlocks
>>    powerpc/pseries: implement paravirt qspinlocks for SPLPAR
>>    powerpc/qspinlock: optimised atomic_try_cmpxchg_lock that adds the
>>      lock hint
>>
>>   arch/powerpc/Kconfig                          |  13 +
>>   arch/powerpc/include/asm/Kbuild               |   2 +
>>   arch/powerpc/include/asm/atomic.h             |  28 ++
>>   arch/powerpc/include/asm/paravirt.h           |  89 +++++
>>   arch/powerpc/include/asm/qspinlock.h          |  91 ++++++
>>   arch/powerpc/include/asm/qspinlock_paravirt.h |   7 +
>>   arch/powerpc/include/asm/simple_spinlock.h    | 292 +++++++++++++++++
>>   .../include/asm/simple_spinlock_types.h       |  21 ++
>>   arch/powerpc/include/asm/spinlock.h           | 308 +-----------------
>>   arch/powerpc/include/asm/spinlock_types.h     |  17 +-
>>   arch/powerpc/lib/Makefile                     |   3 +
>>   arch/powerpc/lib/locks.c                      |  12 +-
>>   arch/powerpc/platforms/powernv/pci-ioda-tce.c |   1 +
>>   arch/powerpc/platforms/pseries/Kconfig        |   5 +
>>   arch/powerpc/platforms/pseries/setup.c        |   6 +-
>>   include/asm-generic/qspinlock.h               |   4 +
>>   16 files changed, 577 insertions(+), 322 deletions(-)
>>   create mode 100644 arch/powerpc/include/asm/paravirt.h
>>   create mode 100644 arch/powerpc/include/asm/qspinlock.h
>>   create mode 100644 arch/powerpc/include/asm/qspinlock_paravirt.h
>>   create mode 100644 arch/powerpc/include/asm/simple_spinlock.h
>>   create mode 100644 arch/powerpc/include/asm/simple_spinlock_types.h
>>
> This patch looks OK to me.

Thanks for reviewing and testing.

> I had run some microbenchmark on powerpc system with or w/o the patch.
> 
> On a 2-socket 160-thread SMT4 POWER9 system (not virtualized):
> 
> 5.8.0-rc4
> =========
> 
> Running locktest with spinlock [runtime = 10s, load = 1]
> Threads = 160, Min/Mean/Max = 77,665/90,153/106,895
> Threads = 160, Total Rate = 1,441,759 op/s; Percpu Rate = 9,011 op/s
> 
> Running locktest with rwlock [runtime = 10s, r% = 50%, load = 1]
> Threads = 160, Min/Mean/Max = 47,879/53,807/63,689
> Threads = 160, Total Rate = 860,192 op/s; Percpu Rate = 5,376 op/s
> 
> Running locktest with spinlock [runtime = 10s, load = 1]
> Threads = 80, Min/Mean/Max = 242,907/319,514/463,161
> Threads = 80, Total Rate = 2,555 kop/s; Percpu Rate = 32 kop/s
> 
> Running locktest with rwlock [runtime = 10s, r% = 50%, load = 1]
> Threads = 80, Min/Mean/Max = 146,161/187,474/259,270
> Threads = 80, Total Rate = 1,498 kop/s; Percpu Rate = 19 kop/s
> 
> Running locktest with spinlock [runtime = 10s, load = 1]
> Threads = 40, Min/Mean/Max = 646,639/1,000,817/1,455,205
> Threads = 40, Total Rate = 4,001 kop/s; Percpu Rate = 100 kop/s
> 
> Running locktest with rwlock [runtime = 10s, r% = 50%, load = 1]
> Threads = 40, Min/Mean/Max = 402,165/597,132/814,555
> Threads = 40, Total Rate = 2,388 kop/s; Percpu Rate = 60 kop/s
> 
> 5.8.0-rc4-qlock+
> ================
> 
> Running locktest with spinlock [runtime = 10s, load = 1]
> Threads = 160, Min/Mean/Max = 123,835/124,580/124,587
> Threads = 160, Total Rate = 1,992 kop/s; Percpu Rate = 12 kop/s
> 
> Running locktest with rwlock [runtime = 10s, r% = 50%, load = 1]
> Threads = 160, Min/Mean/Max = 254,210/264,714/276,784
> Threads = 160, Total Rate = 4,231 kop/s; Percpu Rate = 26 kop/s
> 
> Running locktest with spinlock [runtime = 10s, load = 1]
> Threads = 80, Min/Mean/Max = 599,715/603,397/603,450
> Threads = 80, Total Rate = 4,825 kop/s; Percpu Rate = 60 kop/s
> 
> Running locktest with rwlock [runtime = 10s, r% = 50%, load = 1]
> Threads = 80, Min/Mean/Max = 492,687/525,224/567,456
> Threads = 80, Total Rate = 4,199 kop/s; Percpu Rate = 52 kop/s
> 
> Running locktest with spinlock [runtime = 10s, load = 1]
> Threads = 40, Min/Mean/Max = 1,325,623/1,325,628/1,325,636
> Threads = 40, Total Rate = 5,299 kop/s; Percpu Rate = 132 kop/s
> 
> Running locktest with rwlock [runtime = 10s, r% = 50%, load = 1]
> Threads = 40, Min/Mean/Max = 1,249,731/1,292,977/1,342,815
> Threads = 40, Total Rate = 5,168 kop/s; Percpu Rate = 129 kop/s
> 
> On systems on large number of cpus, qspinlock lock is faster and more fair.
> 
> With some tuning, we may be able to squeeze out more performance.

Yes, powerpc could certainly get more performance out of the slow
paths, and then there are a few parameters to tune.

We don't have a good alternate patching for function calls yet, but
that would be something to do for native vs pv.

And then there seem to be one or two tunable parameters we could
experiment with.

The paravirt locks may need a bit more tuning. Some simple testing
under KVM shows we might be a bit slower in some cases. Whether this
is fairness or something else I'm not sure. The current simple pv
spinlock code can do a directed yield to the lock holder CPU, whereas 
the pv qspl here just does a general yield. I think we might actually
be able to change that to also support directed yield. Though I'm
not sure if this is actually the cause of the slowdown yet.

Thanks,
Nick

  reply	other threads:[~2020-07-07  5:57 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-06  4:35 [PATCH v3 0/6] powerpc: queued spinlocks and rwlocks Nicholas Piggin
2020-07-06  4:35 ` [PATCH v3 1/6] powerpc/powernv: must include hvcall.h to get PAPR defines Nicholas Piggin
2020-07-09 10:05   ` Michael Ellerman
2020-07-06  4:35 ` [PATCH v3 2/6] powerpc/pseries: move some PAPR paravirt functions to their own file Nicholas Piggin
2020-07-09 10:11   ` Michael Ellerman
2020-07-06  4:35 ` [PATCH v3 3/6] powerpc: move spinlock implementation to simple_spinlock Nicholas Piggin
2020-07-09 10:15   ` Michael Ellerman
2020-07-06  4:35 ` [PATCH v3 4/6] powerpc/64s: implement queued spinlocks and rwlocks Nicholas Piggin
2020-07-09 10:20   ` Michael Ellerman
2020-07-09 10:33     ` Peter Zijlstra
2020-07-23 14:37   ` Michal Suchánek
2020-07-06  4:35 ` [PATCH v3 5/6] powerpc/pseries: implement paravirt qspinlocks for SPLPAR Nicholas Piggin
2020-07-09 10:53   ` Michael Ellerman
2020-07-09 11:03     ` Peter Zijlstra
2020-07-09 16:06     ` Waiman Long
2020-07-23 14:00       ` Peter Zijlstra
2020-07-23 18:32         ` Waiman Long
2020-07-23 18:47           ` peterz
2020-07-23 19:04             ` Waiman Long
2020-07-23 19:58               ` peterz
2020-07-23 20:30                 ` Segher Boessenkool
2020-07-23 21:58                 ` Waiman Long
2020-07-24  8:16             ` Will Deacon
2020-07-24 19:10               ` Waiman Long
2020-07-25  3:02                 ` Waiman Long
2020-07-25 17:26                 ` Peter Zijlstra
2020-07-25 17:36                   ` Waiman Long
2020-07-23 14:09     ` Nicholas Piggin
2020-07-06  4:35 ` [PATCH v3 6/6] powerpc/qspinlock: optimised atomic_try_cmpxchg_lock that adds the lock hint Nicholas Piggin
2020-07-06 18:39 ` [PATCH v3 0/6] powerpc: queued spinlocks and rwlocks Waiman Long
2020-07-07  5:57   ` Nicholas Piggin [this message]
2020-07-08  3:33     ` Waiman Long
2020-07-08  5:10       ` Nicholas Piggin
2020-07-08 23:50         ` Waiman Long
2020-07-08 23:58           ` Waiman Long
2020-07-08  8:32       ` Peter Zijlstra
2020-07-08 23:53         ` Waiman Long
2020-07-08  8:41     ` Peter Zijlstra
2020-07-08 23:54       ` Waiman Long
2020-07-09  8:31         ` Peter Zijlstra
2020-07-21 11:20           ` Nicholas Piggin
2020-07-21 11:08       ` Nicholas Piggin
2020-07-21 14:36         ` Waiman Long
2020-07-23 13:30           ` Nicholas Piggin
2020-07-23 14:29             ` Waiman Long
2020-07-23 16:12               ` Nicholas Piggin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1594101082.hfq9x5yact.astroid@bobo.none \
    --to=npiggin@gmail.com \
    --cc=anton@ozlabs.org \
    --cc=boqun.feng@gmail.com \
    --cc=kvm-ppc@vger.kernel.org \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=longman@redhat.com \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=virtualization@lists.linux-foundation.org \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).