From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751507AbdBMDSU (ORCPT ); Sun, 12 Feb 2017 22:18:20 -0500 Received: from mail-pf0-f194.google.com ([209.85.192.194]:35706 "EHLO mail-pf0-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751229AbdBMDST (ORCPT ); Sun, 12 Feb 2017 22:18:19 -0500 Date: Mon, 13 Feb 2017 11:19:43 +0800 From: Boqun Feng To: panxinhui Cc: Waiman Long , Xinhui Pan , Peter Zijlstra , Ingo Molnar , linux-kernel@vger.kernel.org Subject: Re: [PATCH v2] locking/pvqspinlock: Relax cmpxchg's to improve performance on some archs Message-ID: <20170213031943.GE9178@tardis.cn.ibm.com> References: <1482697561-23848-1-git-send-email-longman@redhat.com> <778926a5-cf9f-586b-6bc4-b9453d88aabb@redhat.com> <1f8d7eaf-ac3b-81b8-0d8f-12f60436cc48@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="3XA6nns4nE4KvaS/" Content-Disposition: inline In-Reply-To: <1f8d7eaf-ac3b-81b8-0d8f-12f60436cc48@linux.vnet.ibm.com> User-Agent: Mutt/1.7.2 (2016-11-26) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org --3XA6nns4nE4KvaS/ Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Mon, Feb 13, 2017 at 10:24:38AM +0800, panxinhui wrote: >=20 >=20 > =E5=9C=A8 2017/2/10 =E4=B8=8A=E5=8D=884:53, Waiman Long =E5=86=99=E9=81= =93: > > On 02/07/2017 10:39 PM, Xinhui Pan wrote: > >> > >> > >> 2016-12-26 4:26 GMT+08:00 Waiman Long >: > >> > >> A number of cmpxchg calls in qspinlock_paravirt.h were replaced by= more > >> relaxed versions to improve performance on architectures that use = LL/SC. > >> > >> All the locking related cmpxchg's are replaced with the _acquire > >> variants: > >> - pv_queued_spin_steal_lock() > >> - trylock_clear_pending() > >> > >> The cmpxchg's related to hashing are replaced by either by the _re= lease > >> or the _relaxed variants. See the inline comment for details. > >> > >> Signed-off-by: Waiman Long > > >> > >> v1->v2: > >> - Add comments in changelog and code for the rationale of the ch= ange. > >> > >> --- > >> kernel/locking/qspinlock_paravirt.h | 50 ++++++++++++++++++++++++= ------------- > >> 1 file changed, 33 insertions(+), 17 deletions(-) > >> > >> > >> @@ -323,8 +329,14 @@ static void pv_wait_node(struct mcs_spinlock = *node, struct mcs_spinlock *prev) > >> * If pv_kick_node() changed us to vcpu_hashed, re= tain that > >> * value so that pv_wait_head_or_lock() knows to n= ot also try > >> * to hash this lock. > >> + * > >> + * The smp_store_mb() and control dependency above= will ensure > >> + * that state change won't happen before that. Syn= chronizing > >> + * with pv_kick_node() wrt hashing by this waiter = or by the > >> + * lock holder is done solely by the state variabl= e. There is > >> + * no other ordering requirement. > >> */ > >> - cmpxchg(&pn->state, vcpu_halted, vcpu_running); > >> + cmpxchg_relaxed(&pn->state, vcpu_halted, vcpu_runn= ing); > >> > >> /* > >> * If the locked flag is still not set after wakeu= p, it is a > >> @@ -360,9 +372,12 @@ static void pv_kick_node(struct qspinlock *lo= ck, struct mcs_spinlock *node) > >> * pv_wait_node(). If OTOH this fails, the vCPU was runnin= g and will > >> * observe its next->locked value and advance itself. > >> * > >> - * Matches with smp_store_mb() and cmpxchg() in pv_wait_no= de() > >> + * Matches with smp_store_mb() and cmpxchg_relaxed() in pv= _wait_node(). > >> + * A release barrier is used here to ensure that node->loc= ked is > >> + * always set before changing the state. See comment in pv= _wait_node(). > >> */ > >> - if (cmpxchg(&pn->state, vcpu_halted, vcpu_hashed) !=3D vcp= u_halted) > >> + if (cmpxchg_release(&pn->state, vcpu_halted, vcpu_hashed) > >> + !=3D vcpu_halted) > >> return; > >> > >> hi, Waiman > >> We can't use _release here, a full barrier is needed. > >> > >> There is pv_kick_node vs pv_wait_head_or_lock > >> > >> [w] l->locked =3D _Q_SLOW_VAL //reordered here =20 > >> = if (READ_ONCE(pn->state) =3D=3D vcpu_hashed) //False. > >> = lp =3D (struct qspinlock **)1; > >> > >> [STORE] pn->state =3D vcpu_hashed lp =3D pv_has= h(lock, pn); > >> pv_hash() = if (xchg(&l->locked, _Q_SLOW_VAL) =3D=3D 0) // fasle, not unhashed. > >> > >> Then the same lock has hashed twice but only unhashed once. So at last= as the hash table grows big, we hit RCU stall. > >> > >> I hit RCU stall when I run netperf benchmark > >> > >> thanks > >> xinhui > >> > >> > >> -- > >> 1.8.3.1 > >> > >> > > Yes, I know I am being too aggressive in this patch. I am going to tone= it down a bit. I just don't have time to run a performance test on PPC sys= tem to verify the gain yet. I am planning to send an updated patch soon. > >=20 > hi, All >=20 > I guess I have found the scenario that causes the RCU stall. >=20 Yes, I believe that's the case, as the comment before smp_store_mb() in pv_wait_node states. Good show! Regards, Boqun > pv_wait_node =09 > [L] pn->state // this load is reordered from cmxchg_release. > smp_store_mb(pn->state, vcpu_halted); =09 > if (!READ_ONCE(node->locked)) =09 > arch_mcs_spin_unlock_contended(&next->locked); > pv_kick_node > [-L]cmpxchg_release(&pn->state, vcpu_halted, vcpu_hashed) > //cmpxchg_release fails, so pn->state keep as it is. > pv_wait(&pn->state, vcpu_halted); > //on PPC, It will not return until pn->state !=3D vcpu_halted. > =09 > And when rcu stall hit, I fire an BUG(), and enter debug mode, it seems m= ost cpus are in pv_wait... >=20 > So the soltuion to solve this problems is simple, keep the cmpxchg as it = is in pv_kick_node, cmpxchg on ppc provides full barriers. >=20 > thanks > xinhui >=20 > > Cheers, > > Longman > >=20 >=20 --3XA6nns4nE4KvaS/ Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEj5IosQTPz8XU1wRHSXnow7UH+rgFAlihJckACgkQSXnow7UH +riNQQgAmsOKOWfqorCkviqtlRJKJdyVXmD1VXlsy/XBuU2SsLut6yWG+YKGghfZ 3NanLHryMCN576wMc48rasioOpKtZ9mr4zha90xIn7GtXWSHowq3RsGFPzn3IaVi Xdq+6Ewp32eHIZdIz/OEb8YgrjF4jP2JfQ8u8b7YZLBAnmZO4oiUlrvgWuTI96O1 h3VlddTiTz270Md/xKMXVKBLuA0vEzuZ/OSfglFRzUATLl0O2dEwzSddj91Tvz/Y C1mep+WG7d4xy8POcZ88817B6oKTSr2DENJtdXo42zZSIgZ6TpG+/acJH+8GaZmi 4XfbfjrMRQxR5Zil7V4KCaCGB8rbYg== =HcbA -----END PGP SIGNATURE----- --3XA6nns4nE4KvaS/--