Date: Fri, 20 May 2016 23:06:18 +0200
From: Peter Zijlstra
To: Linus Torvalds
Cc: Boqun Feng, Davidlohr Bueso, Manfred Spraul, Waiman Long,
	Ingo Molnar, ggherdovich@suse.com, Mel Gorman,
	Linux Kernel Mailing List, Paul McKenney, Will Deacon
Subject: Re: sem_lock() vs qspinlocks
Message-ID: <20160520210618.GK3193@twins.programming.kicks-ass.net>
References: <20160520053926.GC31084@linux-uzut.site>
	<20160520115819.GF3193@twins.programming.kicks-ass.net>
	<20160520140533.GA20726@insomnia>
	<20160520152149.GH3193@twins.programming.kicks-ass.net>
	<20160520160436.GQ3205@twins.programming.kicks-ass.net>

On Fri, May 20, 2016 at 10:00:45AM -0700, Linus Torvalds wrote:
> On Fri, May 20, 2016 at 9:04 AM, Peter Zijlstra wrote:
> > + * queued_spin_lock_slowpath() can ACQUIRE the lock before
> > + * issuing the unordered store that sets _Q_LOCKED_VAL.
>
> Ugh. This was my least favorite part of the queued locks, and I really
> liked the completely unambiguous semantics of the x86 atomics-based
> versions we used to have.
>
> But I guess we're stuck with it.

Yeah, I wasn't too happy either when I realized it today.

We _could_ make these atomic ops, but then we're just making things
slower for this one weird case :/

> That said, it strikes me that almost all of the users of
> "spin_is_locked()" are using it for verification purposes (not locking
> correctness),

Right; although I feel those people should be using
lockdep_assert_held() for this instead. That not only compiles out when
!LOCKDEP but also asserts that the current task/cpu is the lock owner,
not someone else.

> and that the people who are playing games with locking
> correctness are few and already have to play *other* games anyway.
>
> See for example "ipc_smp_acquire__after_spin_is_unlocked()", which has
> a big comment atop of it that now becomes nonsensical with this patch.

Not quite; we still need that I think. We now have:

	spin_lock(A);
	smp_mb();
	while (spin_is_locked(B))
		cpu_relax();
	smp_rmb();

And that control dependency together with the rmb form a load-acquire
on the unlocked B, which matches the release of the spin_unlock(B) and
ensures we observe the whole previous critical section we waited for.

The new smp_mb() doesn't help with that.

> Now, I'd take Peter's patch as-is, because I don't think any of this
> matters from a *performance* standpoint, and Peter's patch is much
> smaller and simpler.

I would suggest you do this and also mark it for stable, v4.2 and later.

> But the reason I think it might be a good thing to introduce those
> spin_lock_synchronize() and spin_lock_acquire_after_unlock() concepts
> would be to make it very very clear what those subtle implementations
> in mutexes and the multi-level locks in the ipc layer are doing and
> what they rely on.

We can always do the fancy stuff on top, but that isn't going to need
backporting to all stable trees; this is.
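For the record, a rough and entirely untested sketch of what such
helpers might look like; the names are the ones suggested above (they
do not exist anywhere) and the bodies are simply the A/B pattern from
this mail wrapped up:

#include <linux/spinlock.h>

/*
 * Sketch only; neither helper exists, this is just the pattern from
 * this mail given a name.
 */

/*
 * To be used right after spin_lock(); the full barrier orders the
 * store that took the lock against our subsequent loads of other
 * locks, so two CPUs doing the lock-A-then-check-B dance cannot both
 * miss each other.
 */
static inline void spin_lock_synchronize(void)
{
	smp_mb();
}

/*
 * Wait for @lock to be released and order ourselves against the
 * critical section that released it; the control dependency on the
 * spin_is_locked() load plus the smp_rmb() together act as a
 * load-acquire of the unlocked state, pairing with the releasing
 * spin_unlock().
 */
static inline void spin_lock_acquire_after_unlock(spinlock_t *lock)
{
	while (spin_is_locked(lock))
		cpu_relax();
	smp_rmb();
}

With those, the pattern above would read:

	spin_lock(A);
	spin_lock_synchronize();
	spin_lock_acquire_after_unlock(B);

which at least spells out what is being relied upon.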
I'll think a little more on the explicit documentation approach vs. the
simple fix.
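PS. on the lockdep_assert_held() thing, a minimal illustration of the
verification use case (struct foo is made up, obviously):

#include <linux/lockdep.h>
#include <linux/spinlock.h>

/* Made-up example structure; ->count is only modified under ->lock. */
struct foo {
	spinlock_t	lock;
	int		count;
};

static void foo_dec(struct foo *f)
{
	/*
	 * Verification, not locking correctness: this compiles away
	 * without LOCKDEP and, unlike a spin_is_locked() check, also
	 * warns when somebody _else_ happens to hold the lock.
	 */
	lockdep_assert_held(&f->lock);

	f->count--;
}

Callers then simply do spin_lock(&f->lock); foo_dec(f);
spin_unlock(&f->lock); and lockdep checks the assumption for free on
debug builds.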