Date: Fri, 4 Sep 2015 17:25:23 +0200
From: Peter Zijlstra
To: Linus Torvalds
Cc: Dave Chinner, Linux Kernel Mailing List, Waiman Long, Ingo Molnar
Subject: Re: [4.2, Regression] Queued spinlocks cause major XFS performance regression
Message-ID: <20150904152523.GR18673@twins.programming.kicks-ass.net>
References: <20150904054820.GY3902@dastard> <20150904071143.GZ3902@dastard> <20150904082954.GB3902@dastard> <20150904151427.GG18489@twins.programming.kicks-ass.net>
In-Reply-To: <20150904151427.GG18489@twins.programming.kicks-ass.net>

On Fri, Sep 04, 2015 at 05:14:27PM +0200, Peter Zijlstra wrote:
> On Fri, Sep 04, 2015 at 08:05:16AM -0700, Linus Torvalds wrote:
> > So at the very *minimum*, that second issue should be fixed, and the
> > loop in virt_queued_spin_lock() should look something like
> >
> >	do {
> >		while (READ_ONCE(lock->val) != 0)
> >			cpu_relax();
> >	} while (atomic_cmpxchg(&lock->val, 0, _Q_LOCKED_VAL) != 0);
> >
> > which at least has a chance in hell of behaving well on the bus and in
> > a HT environment.
>
> True.

Something like so...

---
Subject: locking: Fix virt test-and-set lock implementation

Dave ran into horrible performance on a VM without PARAVIRT_SPINLOCKS
set, and Linus noted that the test-and-set implementation spun on the
wrong operation: one should spin on the variable with a load, not an
rmw.

While there, drop the "queued" from the name, as the lock isn't queued
at all, but a simple test-and-set.

Reported-by: Dave Chinner
Suggested-by: Linus Torvalds
Signed-off-by: Peter Zijlstra (Intel)
---
 arch/x86/include/asm/qspinlock.h | 16 ++++++++++++----
 include/asm-generic/qspinlock.h  |  4 ++--
 kernel/locking/qspinlock.c       |  2 +-
 3 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/qspinlock.h b/arch/x86/include/asm/qspinlock.h
index 9d51fae1cba3..8dde3bdc4a05 100644
--- a/arch/x86/include/asm/qspinlock.h
+++ b/arch/x86/include/asm/qspinlock.h
@@ -39,15 +39,23 @@ static inline void queued_spin_unlock(struct qspinlock *lock)
 }
 #endif
 
-#define virt_queued_spin_lock virt_queued_spin_lock
+#define virt_spin_lock virt_spin_lock
 
-static inline bool virt_queued_spin_lock(struct qspinlock *lock)
+static inline bool virt_spin_lock(struct qspinlock *lock)
 {
 	if (!static_cpu_has(X86_FEATURE_HYPERVISOR))
 		return false;
 
-	while (atomic_cmpxchg(&lock->val, 0, _Q_LOCKED_VAL) != 0)
-		cpu_relax();
+	/*
+	 * On hypervisors without PARAVIRT_SPINLOCKS support we fall
+	 * back to a Test-and-Set spinlock, because fair locks have
+	 * horrible lock 'holder' preemption issues.
+	 */
+
+	do {
+		while (atomic_read(&lock->val) != 0)
+			cpu_relax();
+	} while (atomic_cmpxchg(&lock->val, 0, _Q_LOCKED_VAL) != 0);
 
 	return true;
 }
diff --git a/include/asm-generic/qspinlock.h b/include/asm-generic/qspinlock.h
index 83bfb87f5bf1..e2aadbc7151f 100644
--- a/include/asm-generic/qspinlock.h
+++ b/include/asm-generic/qspinlock.h
@@ -111,8 +111,8 @@ static inline void queued_spin_unlock_wait(struct qspinlock *lock)
 		cpu_relax();
 }
 
-#ifndef virt_queued_spin_lock
-static __always_inline bool virt_queued_spin_lock(struct qspinlock *lock)
+#ifndef virt_spin_lock
+static __always_inline bool virt_spin_lock(struct qspinlock *lock)
 {
 	return false;
 }
diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
index 337c8818541d..87e9ce6a63c5 100644
--- a/kernel/locking/qspinlock.c
+++ b/kernel/locking/qspinlock.c
@@ -289,7 +289,7 @@ void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
 	if (pv_enabled())
 		goto queue;
 
-	if (virt_queued_spin_lock(lock))
+	if (virt_spin_lock(lock))
 		return;
 
 	/*
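
For anyone who wants to play with the pattern outside the kernel, below
is a minimal userspace sketch of the same test-and-test-and-set idea
using C11 atomics. The tts_lock type and function names are made up for
illustration; they are not kernel API. The point is the same as in the
patch: every failed cmpxchg pulls the cache line in exclusive state and
hammers the interconnect, whereas plain loads let each waiter spin
locally in its own cache until the unlocking store invalidates the line.

#include <stdatomic.h>

/* Illustrative only; 0 == unlocked, 1 == locked. */
struct tts_lock {
	atomic_int val;
};

static void tts_lock_acquire(struct tts_lock *lock)
{
	int expected;

	do {
		/* Test: wait with plain loads while the lock looks taken. */
		while (atomic_load_explicit(&lock->val, memory_order_relaxed) != 0)
			/* a pause/yield hint would go here, a la cpu_relax() */;

		/* Test-and-set: only now attempt the expensive rmw. */
		expected = 0;
	} while (!atomic_compare_exchange_strong_explicit(&lock->val,
				&expected, 1,
				memory_order_acquire, memory_order_relaxed));
}

static void tts_lock_release(struct tts_lock *lock)
{
	atomic_store_explicit(&lock->val, 0, memory_order_release);
}

Note the result is still unfair, which is exactly what the patch wants
under a hypervisor: any running vCPU can take the lock, instead of a
fair queue stalling behind a preempted waiter.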