From: Hector Martin
To: Will Deacon,
	Linux ARM
Cc: Marc Zyngier,
	Mark Rutland,
	Peter Zijlstra,
	Boqun Feng,
	Catalin Marinas
Subject: LSE atomic op ordering is weaker than intended?
Date: Wed, 3 Mar 2021 22:05:19 +0900

Hi Will and everyone else,

While yak shaving the AIC driver ordering minutiae, I came across this.

atomic_t.txt describes "fully ordered" atomic ops as follows:

 > Fully ordered primitives are ordered against everything prior and
 > everything subsequent. Therefore a fully ordered primitive is like
 > having an smp_mb() before and an smp_mb() after the primitive.

Among those ops are the unqualified atomic_fetch_* ops. On arm64 with
LSE, these are implemented as e.g. LDSETAL, with acquire-release semantics.
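
For comparison, here is a sketch of what the atomic_t.txt wording describes
if taken literally: a relaxed LSE op (LDSET, with no A/L suffix) with an
explicit DMB ISH on each side. The test name is made up for illustration,
and this is not how the kernel implements these ops; it is only the
"smp_mb() before and after" reading written out as a herd7 litmus test.

```
AArch64 lse-atomic-relaxed-with-explicit-barriers
(* Illustrative sketch only: hypothetical test name, not kernel code. *)
(* DMB ISH stands in for smp_mb() on each side of a relaxed LDSET.    *)
{
0:X1=x; 0:X3=y;
1:X1=x; 1:X3=y;
}
  P0                 | P1                 ;
  MOV X0, #1         | MOV X0, #1         ;
  DMB ISH            | DMB ISH            ;
  LDSET X0, X2, [X1] | LDSET X0, X2, [X3] ;
  DMB ISH            | DMB ISH            ;
  LDR X4, [X3]       | LDR X4, [X1]       ;
exists (0:X4=0 /\ 1:X4=0)
```

With a full barrier between each thread's atomic write and its later load,
the 0/0 outcome in the exists clause is expected to be forbidden.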

However, the *AL LSE ops have acquire semantics *for the read* and 
release semantics *for the write*. As independent components of the same 
atomic op, I cannot find anything in the ARM ARM that would imply 
ordering between the Load-Acquire and *prior* memory operations, nor 
ordering between the Store-Release and *subsequent* memory operations.

So it would seem these ops are not in fact fully ordered, but rather
only order the read component against subsequent ops, and the write
component against prior ops.

Put another way: the current implementation means that unqualified ops 
are equal to _acquire + _release semantics as they are described in 
atomic_t.txt, but that is weaker than "fully ordered".

Throwing this litmus test at herd7 seems to confirm this theory:

AArch64 lse-atomic-al-ops-are-not-fully-ordered
{
0:X1=x; 0:X3=y;
1:X1=x; 1:X3=y;
}
  P0                   | P1                  ;
  MOV X0, #1           | MOV X0, #1          ;
  LDSETAL X0, X2, [X1] | LDSETAL X0, X2, [X3];
  LDR X4, [X3]         | LDR X4, [X1]        ;
exists (0:X4=0 /\ 1:X4=0)

The positive result only goes away when a DMB ISH (i.e. smp_mb()) is added
after each of the atomic ops, which contradicts the atomic_t.txt claim.
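
For reference, the barrier variant described above might look like the
following. This is a sketch: the test name is made up, and the only change
from the test above is the DMB ISH placed after each LDSETAL.

```
AArch64 lse-atomic-al-ops-followed-by-dmb
(* Illustrative sketch only: hypothetical test name.           *)
(* Same test as above, with DMB ISH added after each LDSETAL. *)
{
0:X1=x; 0:X3=y;
1:X1=x; 1:X3=y;
}
  P0                   | P1                   ;
  MOV X0, #1           | MOV X0, #1           ;
  LDSETAL X0, X2, [X1] | LDSETAL X0, X2, [X3] ;
  DMB ISH              | DMB ISH              ;
  LDR X4, [X3]         | LDR X4, [X1]         ;
exists (0:X4=0 /\ 1:X4=0)
```

Assuming herd7 is installed, saving either test to a file and passing that
file to herd7 should report whether the exists clause can be satisfied.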

Did I miss something, or is this in fact an issue?

(And while I'm talking to the right people: this issue aside, do atomic 
ops on Normal memory create ordering with Device memory ops, or are 
there no guarantees there due to the fact that Normal memory is mapped 
inner-shareable and the ordering guarantees thus do not extend to 
outer-shareable Device accesses? My current understanding is the 
latter, but I find the ARM ARM wording hard to conclusively grok here.)

Hector Martin
