Subject: Re: [PATCH 4/4] locking: Introduce smp_cond_acquire()
From: Linus Torvalds
To: Will Deacon
Cc: Peter Zijlstra, Boqun Feng, Oleg Nesterov, Ingo Molnar,
    Linux Kernel Mailing List, Paul McKenney, Jonathan Corbet,
    Michal Hocko, David Howells, Michael Ellerman,
    Benjamin Herrenschmidt, Paul Mackerras
Date: Mon, 16 Nov 2015 13:58:49 -0800
In-Reply-To: <20151116162452.GD1999@arm.com>

On Mon, Nov 16, 2015 at 8:24 AM, Will Deacon wrote:
>
> ... or we upgrade spin_unlock_wait to a LOCK operation, which might be
> slightly cheaper than spin_lock()+spin_unlock().

So traditionally the real concern has been the cacheline ping-pong
part of spin_unlock_wait(). I think adding a memory barrier (one that
just forces ordering, not any exclusive cacheline states) to it is
fine, but I don't think we necessarily want it to have to get the
cacheline into exclusive state.

Because if spin_unlock_wait() ends up having to get the spinlock
cacheline anyway (for example, by writing the same value back with a
SC), I don't think spin_unlock_wait() will really be all that much
cheaper than just getting the spinlock, and in that case we shouldn't
play complicated ordering games.

On another issue: I'm also looking at the ARM documentation for stxr,
and the _documentation_ says that it has no stronger ordering than a
"store release", but I'm starting to wonder if that is actually true.
Because I do end up thinking that it does have the same "control
dependency" to all subsequent writes (but not reads). So reads after
the SC can percolate up, but I think writes are restricted.

Why? In order for the SC to be able to return success, the write
itself may not have actually been done yet, but the cacheline for the
write must have successfully been turned into exclusive ownership.
Agreed?

That means that by the time a SC returns success, no other CPU can
see the old value of the spinlock any more. So by the time any
subsequent stores in the locked region can be visible to any other
CPUs, the locked value of the lock itself has to be visible too.
Agreed?

So I think that in effect, when a spinlock is implemented with LL/SC,
the loads inside the locked region are only ordered wrt the acquire
on the LL, but the stores can be considered ordered wrt the SC. No?

So I think a _successful_ SC is still more ordered than just any
random store with release consistency.
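Roughly, in code: here's a sketch of the general shape of such an
LL/SC acquire loop on arm64, with the ordering argument in the
comments. The ll_sc_lock() helper is made up for illustration, this
is not the actual arch/arm64 locking code:

    /* Sketch only: 0 == unlocked, nonzero == held. */
    static inline void ll_sc_lock(int *lock)
    {
        int old, fail;

        asm volatile(
        /* LL with acquire semantics: this is what orders the
         * loads inside the locked region. */
        "1: ldaxr   %w0, [%2]\n"
        /* Lock word nonzero: somebody holds it, spin. */
        "   cbnz    %w0, 1b\n"
        /* SC of the locked value: no release semantics at all,
         * but success proves we held the line exclusively. */
        "   stxr    %w1, %w3, [%2]\n"
        /* SC failed (we lost exclusivity): retry from the LL. */
        "   cbnz    %w1, 1b\n"
        : "=&r" (old), "=&r" (fail)
        : "r" (lock), "r" (1)
        : "memory");
    }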
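And to make the spin_unlock_wait() tradeoff at the top concrete,
here's a minimal user-space sketch of the two options, with made-up
names (sketch_lock, unlock_wait_rdonly(), unlock_wait_locked()) and a
made-up lock word rather than anything the kernel actually does:

    #include <stdatomic.h>

    struct sketch_lock { atomic_int val; };  /* 0 == unlocked */

    /* Option 1: read-only wait plus ordering. Waiters only load
     * the lock word, so the cacheline can stay in shared state on
     * every CPU: we get ordering, but no ping-pong. */
    static void unlock_wait_rdonly(struct sketch_lock *l)
    {
        while (atomic_load_explicit(&l->val, memory_order_acquire))
            /* spin on a plain load */;
    }

    /* Option 2: "upgrade" the wait to a real LOCK operation. The
     * successful write forces exclusive ownership of the line,
     * which is exactly the cost in question. */
    static void unlock_wait_locked(struct sketch_lock *l)
    {
        int expected = 0;

        while (!atomic_compare_exchange_weak_explicit(&l->val,
                &expected, 1,
                memory_order_acquire, memory_order_relaxed))
            expected = 0;  /* CAS overwrote it on failure */
        atomic_store_explicit(&l->val, 0, memory_order_release);
    }

The read-only version never dirties the line, which is its whole
attraction; the moment you have to write (CAS, SC, whatever), you've
paid for exclusive ownership anyway.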
Of course, I'm not sure that SC store ordering actually *helps* us,
because I think the problem tends to be loads in the locked region
moving up earlier than the actual store that sets the lock, but maybe
it makes some difference.

                 Linus