Subject: Re: sem_lock() vs qspinlocks
From: Linus Torvalds
To: Peter Zijlstra
Cc: Boqun Feng, Davidlohr Bueso, Manfred Spraul, Waiman Long,
    Ingo Molnar, ggherdovich@suse.com, Mel Gorman,
    Linux Kernel Mailing List, Paul McKenney, Will Deacon
Date: Fri, 20 May 2016 10:00:45 -0700

On Fri, May 20, 2016 at 9:04 AM, Peter Zijlstra wrote:
> + * queued_spin_lock_slowpath() can ACQUIRE the lock before
> + * issuing the unordered store that sets _Q_LOCKED_VAL.

Ugh. This was my least favorite part of the queued locks, and I really
liked the completely unambiguous semantics of the x86 atomics-based
versions we used to have. But I guess we're stuck with it.

That said, it strikes me that almost all of the users of
"spin_is_locked()" are using it for verification purposes (not locking
correctness), and that the people who are playing games with locking
correctness are few and already have to play *other* games anyway.

See for example "ipc_smp_acquire__after_spin_is_unlocked()", which has
a big comment atop of it that now becomes nonsensical with this patch.

So I do wonder if we should make that smp_mb() be something the
*caller* has to do, and document the rules for it. IOW, introduce a new
spinlock primitive called "spin_lock_synchronize()", and then spinlock
implementations that have this non-atomic behavior with an unordered
store would do something like

    static inline void queued_spin_lock_synchronize(struct qspinlock *a,
                                                    struct qspinlock *b)
    {
            smp_mb();
    }

and then we'd document that *if* you need ordering guarantees between

    spin_lock(a);
    .. spin_is_locked/spin_unlock_wait(b) ..

you have to have a

    spin_lock_synchronize(a, b);

in between.

A spin-lock implementation with the old x86 atomic semantics would
make it a no-op.

We should also introduce something like that
"spin_lock_acquire_after_unlock()" so that the ipc/sem.c behavior
would be documented too. Partly because some spinlock implementations
might have stronger unlock ordering and it could be a no-op for them,
but mostly for documentation of the rules.

Now, I'd take Peter's patch as-is, because I don't think any of this
matters from a *performance* standpoint, and Peter's patch is much
smaller and simpler.

But the reason I think it might be a good thing to introduce those
spin_lock_synchronize() and spin_lock_acquire_after_unlock() concepts
would be to make it very clear what those subtle implementations in
mutexes and the multi-level locks in the ipc layer are doing and what
they rely on.

Comments?
There are arguments for Peter's simple approach too ("robust and
simpler interface" vs "make our requirements explicit").

                Linus
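
To make the proposed rule concrete, here is a minimal sketch of what a
caller would look like. This is only an illustration of the proposal
above, not existing kernel API: spin_lock_synchronize() is the
hypothetical primitive, and check_other_lock() is a made-up caller.

    #include <linux/spinlock.h>

    /*
     * Hypothetical primitive from the proposal above.  On a queued
     * spinlock implementation, where spin_lock() can acquire the lock
     * via an unordered store, this must be a full barrier; on an
     * implementation with the old x86 atomic semantics it could be a
     * no-op.
     */
    static inline void spin_lock_synchronize(spinlock_t *a, spinlock_t *b)
    {
            smp_mb();       /* order the store acquiring 'a' before any load of 'b' */
    }

    /*
     * The documented caller-side rule: if two CPUs each take one of a
     * pair of locks and then check the other, the barrier guarantees
     * they cannot both miss each other's acquisition.
     */
    static void check_other_lock(spinlock_t *a, spinlock_t *b)
    {
            spin_lock(a);
            spin_lock_synchronize(a, b);    /* required in between */
            if (spin_is_locked(b)) {
                    /* someone holds 'b'; take the slow path */
            }
            spin_unlock(a);
    }

Note that the two-lock signature carries no information in this sketch;
it exists so the call site documents which pair of locks is being
ordered, which is the point of the exercise.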
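Similarly, a sketch of the second primitive, modeled on the smp_rmb()
that ipc/sem.c open-codes as ipc_smp_acquire__after_spin_is_unlocked().
The name spin_lock_acquire_after_unlock() comes from the proposal
above, and the caller is only patterned on sem_wait_array(), assuming
the struct sem_array layout of the ipc/sem.c of that era.

    /*
     * Hypothetical primitive, modeled on ipc/sem.c's
     * ipc_smp_acquire__after_spin_is_unlocked().  It pairs with a
     * preceding spin_unlock_wait()/spin_is_locked() that observed the
     * lock as free, and upgrades that observation to ACQUIRE ordering:
     * everything the previous holder did before unlocking is visible
     * after this.  An implementation with stronger unlock ordering
     * could make it a no-op.
     */
    static inline void spin_lock_acquire_after_unlock(void)
    {
            smp_rmb();
    }

    /* Illustrative caller, patterned on sem_wait_array() in ipc/sem.c: */
    static void wait_for_simple_ops(struct sem_array *sma)
    {
            int i;

            /* wait until all currently ongoing simple ops have dropped
             * their per-semaphore locks ... */
            for (i = 0; i < sma->sem_nsems; i++)
                    spin_unlock_wait(&sma->sem_base[i].lock);

            /* ... and make their critical sections visible to us */
            spin_lock_acquire_after_unlock();
    }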