Message-ID: <564249A3.3070901@hpe.com>
Date: Tue, 10 Nov 2015 14:46:43 -0500
From: Waiman Long
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0.12) Gecko/20130109 Thunderbird/10.0.12
To: Peter Zijlstra
CC: Ingo Molnar, Thomas Gleixner, "H. Peter Anvin", x86@kernel.org,
 linux-kernel@vger.kernel.org, Scott J Norton, Douglas Hatch,
 Davidlohr Bueso
Subject: Re: [PATCH tip/locking/core v10 6/7] locking/pvqspinlock: Allow limited lock stealing
References: <1447114167-47185-1-git-send-email-Waiman.Long@hpe.com>
 <1447114167-47185-7-git-send-email-Waiman.Long@hpe.com>
 <20151110160343.GE17308@twins.programming.kicks-ass.net>
In-Reply-To: <20151110160343.GE17308@twins.programming.kicks-ass.net>

On 11/10/2015 11:03 AM, Peter Zijlstra wrote:
> On Mon, Nov 09, 2015 at 07:09:26PM -0500, Waiman Long wrote:
>> @@ -291,7 +292,7 @@ static __always_inline void __pv_wait_head(struct qspinlock *lock,
>>  void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
>>  {
>>          struct mcs_spinlock *prev, *next, *node;
>> -        u32 new, old, tail;
>> +        u32 new, old, tail, locked;
>>          int idx;
>>
>>          BUILD_BUG_ON(CONFIG_NR_CPUS >= (1U << _Q_TAIL_CPU_BITS));
>> @@ -431,11 +432,25 @@ queue:
>>           * sequentiality; this is because the set_locked() function below
>>           * does not imply a full barrier.
>>           *
>> +         * The PV pv_wait_head_or_lock function, if active, will acquire
>> +         * the lock and return a non-zero value. So we have to skip the
>> +         * smp_load_acquire() call. As the next PV queue head hasn't been
>> +         * designated yet, there is no way for the locked value to become
>> +         * _Q_SLOW_VAL. So both the set_locked() and the
>> +         * atomic_cmpxchg_relaxed() calls will be safe.
>> +         *
>> +         * If PV isn't active, 0 will be returned instead.
>> +         *
>>           */
>> -        pv_wait_head(lock, node);
>> -        while ((val = smp_load_acquire(&lock->val.counter)) & _Q_LOCKED_PENDING_MASK)
>> +        locked = val = pv_wait_head_or_lock(lock, node);
>> +        if (locked)
>> +                goto reset_tail_or_wait_next;
>> +
>> +        while ((val = smp_load_acquire(&lock->val.counter))
>> +                        & _Q_LOCKED_PENDING_MASK)
>>                  cpu_relax();
>>
>> +reset_tail_or_wait_next:
>>          /*
>>           * claim the lock:
>>           *
>> @@ -447,8 +462,12 @@ queue:
>>           * to grab the lock.
>>           */
>>          for (;;) {
>> -                if (val != tail) {
>> -                        set_locked(lock);
>> +                /*
>> +                 * The lock value may or may not have the _Q_LOCKED_VAL bit set.
>> +                 */
>> +                if ((val & _Q_TAIL_MASK) != tail) {
>> +                        if (!locked)
>> +                                set_locked(lock);
>>                          break;
>>                  }
>>                  /*
> How about this instead? If we've already got _Q_LOCKED_VAL set, issuing
> that store again isn't much of a problem, the cacheline is already hot
> and we own it and it's a regular store, not an atomic.
>
> @@ -432,10 +433,13 @@ void queued_spin_lock_slowpath(struct qs
>           * does not imply a full barrier.
>           *
>           */
> -        pv_wait_head(lock, node);
> +        if ((val = pv_wait_head_or_lock(lock, node)))
> +                goto locked;
> +
>          while ((val = smp_load_acquire(&lock->val.counter)) & _Q_LOCKED_PENDING_MASK)
>                  cpu_relax();
>
> +locked:
>          /*
>           * claim the lock:
>           *
> @@ -447,7 +451,8 @@ void queued_spin_lock_slowpath(struct qs
>           * to grab the lock.
>           */
>          for (;;) {
> -                if (val != tail) {
> +                /* In the PV case we might already have _Q_LOCKED_VAL set */
> +                if ((val & _Q_TAIL_MASK) != tail) {
>                          set_locked(lock);
>                          break;
>                  }
>

That is certainly fine. I was doing that originally, but then changed it
to add the additional if check.

BTW, I have a process question. Should I just resend patch 6, or should I
resend the whole series? I do have a couple of bugs in the
(_Q_PENDING_BITS != 8) part of the patch that I need to fix, too.

Cheers,
Longman
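
P.S. For readers following along outside the kernel tree, below is a minimal
user-space sketch of the claim step in the shape Peter suggests. It uses C11
atomics in place of the kernel's primitives; the constants only mimic the
qspinlock word layout, and claim(), set_locked() and the file name here are
illustrative stand-ins, not the real slowpath code. The point it models is
that re-issuing the locked store after the paravirt path has already taken
the lock is harmless, so the extra "locked" flag can be dropped.

/* build: cc -std=c11 -O2 -o claim-sketch claim-sketch.c */
#include <inttypes.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative constants mimicking the qspinlock word layout. */
#define Q_LOCKED_VAL  1U
#define Q_TAIL_OFFSET 16U
#define Q_TAIL_MASK   (~0U << Q_TAIL_OFFSET)

/*
 * The kernel's set_locked() is a plain byte store; a relaxed OR of the
 * locked bit is the closest portable stand-in here.  Issuing it again
 * when the bit is already set just rewrites a bit we already own.
 */
static void set_locked(_Atomic uint32_t *lock)
{
        atomic_fetch_or_explicit(lock, Q_LOCKED_VAL, memory_order_relaxed);
}

/*
 * Claim step in the shape Peter suggests.  @val is the last observed
 * lock value (it may already contain Q_LOCKED_VAL if the paravirt wait
 * function acquired the lock for us), @tail is this CPU's encoded tail.
 */
static void claim(_Atomic uint32_t *lock, uint32_t val, uint32_t tail)
{
        for (;;) {
                /* In the PV case we might already have Q_LOCKED_VAL set. */
                if ((val & Q_TAIL_MASK) != tail) {
                        set_locked(lock);       /* others queued behind us */
                        return;
                }
                /* Last queued CPU: clear the tail and take the lock at once. */
                if (atomic_compare_exchange_strong_explicit(lock, &val,
                                Q_LOCKED_VAL, memory_order_relaxed,
                                memory_order_relaxed))
                        return;
                /* The failed cmpxchg reloaded val; retry. */
        }
}

int main(void)
{
        _Atomic uint32_t lock;
        uint32_t my_tail = 1U << Q_TAIL_OFFSET;

        /* Model the PV case: the wait function already set the locked bit. */
        atomic_init(&lock, my_tail | Q_LOCKED_VAL);
        claim(&lock, my_tail | Q_LOCKED_VAL, my_tail);
        printf("lock word after claim: 0x%" PRIx32 "\n", atomic_load(&lock));
        return 0;
}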