Re: [PATCH 4/4] locking: Introduce smp_cond_acquire()

From: Linus Torvalds <torvalds@linux-foundation.org>
To: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Will Deacon <will.deacon@arm.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@kernel.org>, Oleg Nesterov <oleg@redhat.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	boqun.feng@gmail.com, Jonathan Corbet <corbet@lwn.net>,
	Michal Hocko <mhocko@kernel.org>,
	David Howells <dhowells@redhat.com>
Subject: Re: [PATCH 4/4] locking: Introduce smp_cond_acquire()
Date: Tue, 3 Nov 2015 11:40:24 -0800
Message-ID: <CA+55aFyJJGaW2RpvDkEjgbd1dYpLNzF7n+QUxkbFdNiCf_xgDg@mail.gmail.com>
In-Reply-To: <20151103015743.GC29027@linux.vnet.ibm.com>

On Mon, Nov 2, 2015 at 5:57 PM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
>>
>> Alpha isn't special. And smp_read_barrier_depends() hasn't magically
>> become something new.
>
> The existing control dependencies (READ_ONCE_CTRL() and friends) only
> guarantee ordering against later stores, and not against later loads.

Right. And using "smp_read_barrier_depends()" for them is WRONG.

That's my argument.

Your arguments make no sense.

> Of the weakly ordered architectures, only Alpha fails to respect
> load-to-store control dependencies.

NO IT DOES NOT.

Christ, Paul. I even sent you the alpha architecture manual
information where it explicitly says that there's a dependency
ordering for that case.

There's a reason that "smp_read_barrier_depends()" is called
smp_READ_barrier_depends().

It's a rmb. Really.

You have turned it into something else in your mind. But your mind is WRONG.

> I am in India and my Alpha Architecture Manual is in the USA.

I sent you a link to something that should work, and that has the section.

> And they did post a clarification on the web:

So for alpha, you trust a random web posting by an unknown person that
talks about some random problem in an application that we don't even
know what it is.

But you don't trust the architecture manual, and you don't trust the
fact that it is *physically impossible* to not have the
load-to-control-to-store ordering without some kind of magical
nullifying stores that we know that alpha didn't have?

But then magically, you trust the architecture manuals for other
architectures, or just take the "you cannot have smp-visible
speculative stores" on faith.

But alpha is different, and lives in a universe where causality
suddenly doesn't exist.

I really don't understand your logic.

> So 5.6.1.7 apparently does not sanction data dependency ordering.

Exactly why are you arguing against the architecture manual?

>> Paul? I really disagree with how you have totally tried to re-purpose
>> smp_read_barrier_depends() in ways that are insane and make no sense.
>
> I did consider adding a new name, but apparently was lazy that day.
> I would of course be happy to add a separate name.

That is NOT WHAT I WANT AT ALL.

Get rid of the smp_read_barrier_depends(). It doesn't do control
barriers against stores, it has never done that, and they aren't
needed in the first place.

There is no need for a new name. The only thing that there is a need
for is to just realize that alpha isn't magical.

Alpha does have a stupid segmented cache which means that even when
the core is constrained by load-to-load data address dependencies
(because no alpha actually did value prediction), the cachelines that
the core loads may not be ordered without the memory barrier.

But that "second load may get a stale value from the past" is purely
about loads. You cannot have the same issue happen for stores, because
there's no way a store buffer somehow magically goes backwards in time
and exposes the store before the load that it depended on.
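
As a rough illustration of the load-to-load case, here is a user-space
C11 sketch (not kernel code). The names msg, slot, published, publish()
and read_payload() are made up for this example. memory_order_consume
stands in for the barrier a reader needs on alpha between the pointer
load and the dependent load (in the kernel that job falls to
smp_read_barrier_depends()); most compilers simply promote consume to
acquire.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical message type, for illustration only. */
struct msg {
	int payload;
};

static struct msg slot;
static _Atomic(struct msg *) published = NULL;

/* Writer: fill in the data, then publish the pointer with release order. */
static void publish(int value)
{
	slot.payload = value;
	atomic_store_explicit(&published, &slot, memory_order_release);
}

/*
 * Reader: load the pointer, then do a load whose address depends on it.
 * This is the load-to-load pattern where alpha needs a read barrier
 * between the two loads. Returns -1 if nothing has been published yet.
 */
static int read_payload(void)
{
	struct msg *p = atomic_load_explicit(&published, memory_order_consume);

	if (!p)
		return -1;
	return p->payload;
}
```

Both accesses in read_payload() are loads. Nothing in this pattern
involves a store that depends on a load, which is the case the rest of
this mail is about.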

And the architecture manual even *EXPLICITLY* says so. Alpha does
actually have a dependency chain from loads to dependent stores -
through addresses, through data, _and_ through control. It's
documented, but equally importantly, it's just basically impossible to
have an architecture that doesn't do that.
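
The load-to-control-to-store shape being discussed looks roughly like
the following user-space C11 sketch. The names flag, data and
consumer_step() are hypothetical. It only shows the shape of the code:
the store is reached only through a branch on the loaded value, and no
read barrier sits between the load and the store.

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_int flag;	/* written by some other thread (assumed) */
static atomic_int data;	/* store that is conditional on flag */

/*
 * Load flag, branch on it, and store only if it was set.
 * Returns 1 if the store was issued, 0 otherwise.
 */
static int consumer_step(void)
{
	if (atomic_load_explicit(&flag, memory_order_relaxed)) {
		/* Reached only after the load above has produced a value. */
		atomic_store_explicit(&data, 1, memory_order_relaxed);
		return 1;
	}
	return 0;
}
```

The argument above is about what the hardware can do. A compiler is a
separate matter: C11 does not promise to preserve a control dependency,
which is why kernel code would use READ_ONCE() and WRITE_ONCE() here
rather than relaxed atomics.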

Even if you do some really fancy things like actual value prediction
(which the alpha architecture _allows_, although no alpha ever did,
afaik), that cannot break the load->store dependency. Because even if
the load value was predicted (or any control flow after the load was
predicted), the store cannot be committed and made visible to other
CPUs until after that prediction has been verified.

So there may be bogus values in a store buffer, but those values will
have to be killed with the store instructions that caused them, if any
prediction failed. They won't be visible to other CPUs.
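
A toy model of that behaviour, as a C sketch. This is a simulation
written for illustration, not a description of any real alpha
implementation; struct store_buffer, memory[], sb_push() and
sb_resolve() are all invented names. Speculative stores sit in a
private buffer and reach the globally visible memory[] only once the
prediction they depended on has been verified.

```c
#include <assert.h>
#include <stddef.h>

#define SB_SLOTS	4
#define MEM_WORDS	8

/* Private to one simulated CPU: stores that have not been committed. */
struct store_buffer {
	size_t addr[SB_SLOTS];
	int val[SB_SLOTS];
	size_t n;
};

/* What the other simulated CPUs are able to observe. */
static int memory[MEM_WORDS];

/* Queue a speculative store. Returns 0 on success, -1 if it cannot be queued. */
static int sb_push(struct store_buffer *sb, size_t addr, int val)
{
	if (sb->n == SB_SLOTS || addr >= MEM_WORDS)
		return -1;
	sb->addr[sb->n] = addr;
	sb->val[sb->n] = val;
	sb->n++;
	return 0;
}

/*
 * Resolve the prediction the buffered stores depended on.
 * If it was right, the stores become visible in program order.
 * If it was wrong, they are discarded and never reach memory[].
 */
static void sb_resolve(struct store_buffer *sb, int prediction_ok)
{
	if (prediction_ok) {
		for (size_t i = 0; i < sb->n; i++)
			memory[sb->addr[i]] = sb->val[i];
	}
	sb->n = 0;
}
```

In this model a mispredicted store to word 3 leaves memory[3]
untouched, and a correctly predicted one shows up only after
sb_resolve() runs, that is, after the load it depended on has
completed.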

And I say that not because the architecture manual says so (although
it does), but because such a CPU wouldn't *work*. It wouldn't
be a CPU, it would at most be a fuzzy neural network that generates
cool results that may be interesting, but they won't be dependable or
reliable in the sense that the Linux kernel depends on.

>> That is not a control dependency. If it was, then PPC and ARM would
>> need to make it a real barrier too. I really don't see why you have
>> singled out alpha as the victim of your "let's just randomly change
>> the rules".
>
> Just trying to make this all work on Alpha as well as the other
> architectures...  But if the actual Alpha hardware that is currently
> running recent Linux kernels is more strict than the architecture

.. really. This is specified in the architecture manual. The fact that
you have found some random posting by some random person that says
"you need barriers everywhere" is immaterial. That support blog may
well have been simply a "I don't know what I am talking about, and I
don't know what problem you have, but I do know that the memory model
is really nasty, and adding random barriers will make otherwise
correct code work".

But we don't add random read barriers to make a control-to-store
barrier. Not when the architecture manual explicitly says there is a
dependency chain there, and not when I don't see how you could
possibly even make a valid CPU that doesn't have that dependency.

                Linus