linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: Will Deacon <will@kernel.org>
To: Hector Martin <marcan@marcan.st>
Cc: Linux ARM <linux-arm-kernel@lists.infradead.org>,
	Marc Zyngier <maz@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Boqun Feng <boqun.feng@gmail.com>,
	Catalin Marinas <catalin.marinas@arm.com>
Subject: Re: LSE atomic op ordering is weaker than intended?
Date: Wed, 3 Mar 2021 15:36:19 +0000	[thread overview]
Message-ID: <20210303153619.GA19247@willie-the-truck> (raw)
In-Reply-To: <90ea3e27-b2ef-41ac-75c7-5f0686603b2a@marcan.st>

Hi Hector,

On Wed, Mar 03, 2021 at 10:05:19PM +0900, Hector Martin wrote:
> While yak shaving the AIC driver ordering minutiae, I came across this.
> 
> atomic_t.txt describes "fully ordered" atomic ops as follows:
> 
> > Fully ordered primitives are ordered against everything prior and
> > everything subsequent. Therefore a fully ordered primitive is like
> > having an smp_mb() before and an smp_mb() after the primitive.
> 
> And among those ops are the atomic_fetch_* ops. These are implemented as
> e.g. LDSETAL, with acquire-release semantics.
> 
> However, the *AL LSE ops have acquire semantics *for the read* and release
> semantics *for the write*. As independent components of the same atomic op,
> I cannot find anything in the ARM ARM that would imply ordering between the
> Load-Acquire and *prior* memory operations, nor ordering between the
> Store-Release and *subsequent* memory operations.
> 
> So it would seem these ops are not in fact fully ordered, but rather, only
> order the read component against prior ops, and the write component against
> subsequent ops.
> 
> Put another way: the current implementation means that unqualified ops are
> equal to _acquire + _release semantics as they are described in
> atomic_t.txt, but that is weaker than "fully ordered".
> 
> Throwing this litmus test at herd7 seems to confirm this theory:
> 
> AArch64 lse-atomic-al-ops-are-not-fully-ordered
> ""
> {
> 0:X1=x; 0:X3=y;
> 1:X1=x; 1:X3=y;
> }
>  P0                   | P1                  ;
>  MOV X0, #1           | MOV X0, #1          ;
>  LDSETAL X0, X2, [X1] | LDSETAL X0, X2, [X3];
>  LDR X4, [X3]         | LDR X4, [X1]        ;
> exists (0:X4=0 /\ 1:X4=0)
> 
> The positive result goes away adding a DMB ISH (i.e. smp_mb()) after the
> atomic ops, which contradicts the atomic_t.txt claim.
> 
> Did I miss something, or is this in fact an issue?

Both. The -AL atomics are actually special-cased in the
"barrier-ordered-before" relation in the Arm ARM:

  [RW1 is barrier-ordered-before if]
  * RW1 appears in program order before an atomic instruction with both
    Acquire and Release semantics that appears in program order before
    RW2.

However, that isn't sufficient to order prior accesses with the "load part"
of the RmW and later accesses with the "store part" of the RmW, as you have
observed in your test. I'm aware of some pending proposals in this area of
the architecture, so I'm reluctant to make any changes until that's
bottomed-out, but I'll make a note to chase that up.

> (And while I'm talking to the right people: this issue aside, do atomic ops
> on Normal memory create ordering with Device memory ops, or are there no
> guarantees there due to the fact that Normal memory is mapped
> inner-shareable and the ordering guarantees thus do not extend to
> outer-shareable Device accesses? My currenty understanding is the latter,
> but I find the ARM ARM wording hard to conclusively grok here.)

Outer-shareable is a superset of inner-shareable, but I think this would be
easier with a specific example. I'll go and look at the AIC patch, since
this is all a lot easier to talk about in the context of some real code.

Which is the latest version I should look at?

Will

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2021-03-03 23:02 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-03 13:05 LSE atomic op ordering is weaker than intended? Hector Martin
2021-03-03 15:36 ` Will Deacon [this message]
2021-03-03 18:04   ` Hector Martin
2021-03-03 18:40     ` Will Deacon
2021-03-03 19:37       ` Hector Martin
2021-03-03 21:38         ` Will Deacon
2021-03-04  8:16           ` Hector Martin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210303153619.GA19247@willie-the-truck \
    --to=will@kernel.org \
    --cc=boqun.feng@gmail.com \
    --cc=catalin.marinas@arm.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=marcan@marcan.st \
    --cc=mark.rutland@arm.com \
    --cc=maz@kernel.org \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).