Date: Thu, 12 Nov 2015 10:21:39 -0800
Subject: Re: [PATCH 4/4] locking: Introduce smp_cond_acquire()
From: Linus Torvalds
To: Boqun Feng
Cc: Oleg Nesterov, Peter Zijlstra, Ingo Molnar,
    Linux Kernel Mailing List, Paul McKenney, Jonathan Corbet,
    Michal Hocko, David Howells, Will Deacon, Michael Ellerman,
    Benjamin Herrenschmidt, Paul Mackerras

On Wed, Nov 11, 2015 at 11:14 PM, Boqun Feng wrote:
>
> Hmm.. probably incorrect.. because the ACQUIRE semantics of
> spin_lock() only guarantees that the memory operations following
> spin_lock() can't be reordered before the *LOAD* part of
> spin_lock(), not the *STORE* part, i.e. the case below can happen
> (assuming the spin_lock() is implemented as an ll/sc loop)
>
>     spin_lock(&lock):
>       r1 = *lock;             // LL, r1 == 0
>       o = READ_ONCE(object);  // could be reordered here.
>       *lock = 1;              // SC

It may be worth noting that at least in theory, not only reads may
pass the store. If the spin-lock is done as

    r1 = *lock    // read-acquire, r1 == 0
    *lock = 1     // SC

then even *writes* inside the locked region might pass up through the
"*lock = 1". So other CPUs - ones that haven't taken the spinlock -
could see the modifications inside the critical region before they
actually see the lock itself change.

Now, the point of spin_unlock_wait() (and "spin_is_locked()") should
generally be that you have some external ordering guarantee that the
lock has been taken. For example, for the IPC semaphores, we do
either one of:

 (a) get the large lock, then - once you hold that lock - wait for
     each small lock

or

 (b) get a small lock, then - once you hold that lock - check that
     the large lock is unlocked

and that's the case we should really worry about. The other uses of
spin_unlock_wait() should have similar "I have other reasons to know
the lock was taken, or will never be taken after this, because XYZ"
guarantees.

This is why powerpc has a memory barrier in "arch_spin_is_locked()".
Exactly so that the "check that the other lock is unlocked" is
guaranteed to be ordered wrt the store that gets the first lock.

It looks like ARM64 gets this wrong and is fundamentally buggy wrt
"spin_is_locked()" (and, as a result, "spin_unlock_wait()").

BUT! And this is a big BUT: it should be noted that that is purely an
ARM64 bug. Not a bug in our users. If you have a spinlock where the
"get lock write" part of the lock can be delayed, then you have to
have an "arch_spin_is_locked()" that has the proper memory barriers.
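To make that concrete, here is a litmus-test-style sketch of how case
(b) can go wrong (the lock names are made up, standing in for the
ipc/sem.c array-wide and per-semaphore locks, and I'm assuming an
LL/SC spin_lock() whose store-conditional is release-only):

    spinlock_t small, large;

    /* CPU0: case (b) - take the small lock, check the large one */
    spin_lock(&small);             /* LL sees small == 0 ... */
    r0 = spin_is_locked(&large);   /* ... but this load can execute
                                      before the SC that publishes
                                      small = 1 */

    /* CPU1: case (a) - take the large lock, wait on the small one */
    spin_lock(&large);
    r1 = spin_is_locked(&small);   /* the spin_unlock_wait()-style
                                      check: CPU0's SC may not be
                                      visible yet, so this can still
                                      see small == 0 */

This is just the classic store-buffering pattern: a store followed by
a load of the other location on each side. Without a full barrier
between them, "r0 == 0 && r1 == 0" is a legal outcome, and then both
CPUs think they own the critical section.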
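And the fix is what powerpc does - roughly this, paraphrased from
memory rather than quoted from arch/powerpc/include/asm/spinlock.h,
so check the real source:

    static inline int arch_spin_is_locked(arch_spinlock_t *lock)
    {
            /* order the prior lock-acquiring store against the
               read of the lock word below */
            smp_mb();
            return !arch_spin_value_unlocked(*lock);
    }

The smp_mb() is the full barrier that a release-only SC doesn't give
you, so the "is the other lock held?" read can't float up past the
store that took our own lock.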
Of course, ARM still hides their architecture manuals in odd places,
so I can't double-check. But afaik, the ARM64 store-conditional is
"store exclusive with release", it has only release semantics, and so
ARM64 really does have the above bug.

On that note: can anybody point me to the latest ARM64 8.1
architecture manual in pdf form, without the "you have to register"
crap? I thought ARM released it, but all my googling just points to
the idiotic ARM service center that wants me to sign away something
just to see the docs. Which I don't do.

                  Linus