All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@linux.ibm.com>
To: Will Deacon <will@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Joel Fernandes <joel@joelfernandes.org>,
	Alan Stern <stern@rowland.harvard.edu>,
	rostedt <rostedt@goodmis.org>,
	Valentin Schneider <valentin.schneider@arm.com>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Boqun Feng <boqun.feng@gmail.com>,
	Will Deacon <will.deacon@arm.com>,
	David Howells <dhowells@redhat.com>
Subject: Re: [PATCH 1/1] Fix: trace sched switch start/stop racy updates
Date: Wed, 21 Aug 2019 06:23:10 -0700	[thread overview]
Message-ID: <20190821132310.GC28441@linux.ibm.com> (raw)
In-Reply-To: <20190821103200.kpufwtviqhpbuv2n@willie-the-truck>

On Wed, Aug 21, 2019 at 11:32:01AM +0100, Will Deacon wrote:
> On Tue, Aug 20, 2019 at 01:29:32PM -0700, Paul E. McKenney wrote:
> > On Tue, Aug 20, 2019 at 03:56:12PM +0200, Peter Zijlstra wrote:
> > > On Sat, Aug 17, 2019 at 01:08:02AM -0700, Linus Torvalds wrote:
> > > 
> > > > The data tearing issue is almost a non-issue. We're not going to add
> > > > WRITE_ONCE() to these kinds of places for no good reason.
> > > 
> > > Paulmck actually has an example of that somewhere; ISTR that particular
> > > case actually got fixed by GCC, but I'd really _love_ for some compiler
> > > people (both GCC and LLVM) to state that their respective compilers will
> > > not do load/store tearing for machine word sized load/stores.
> > 
> > I do very much recall such an example, but I am now unable to either
> > find it or reproduce it.  :-/
> > 
> > If I cannot turn it up in a few days, I will ask the LWN editors to
> > make appropriate changes to the "Who is afraid" article.
> > 
> > > Without this written guarantee (which supposedly was in older GCC
> > > manuals but has since gone missing), I'm loathe to rely on it.
> > > 
> > > Yes, it is very rare, but it is a massive royal pain to debug if/when it
> > > does do happen.
> > 
> > But from what I can see, Linus is OK with use of WRITE_ONCE() for data
> > races on any variable for which there is at least one READ_ONCE().
> > So we can still use WRITE_ONCE() as we would like in our own code.
> > Yes, you or I might be hit by someone else's omission of WRITE_ONCE(),
> > it is better than the proverbial kick in the teeth.
> > 
> > Of course, if anyone knows of a compiler/architecture combination that
> > really does tear stores of 32-bit constants, please do not keep it
> > a secret!  After all, it would be good to get that addressed easily
> > starting now rather than after a difficult and painful series of
> > debugging sessions.
> 
> It's not quite what you asked for, but if you look at the following
> silly code:
> 
> typedef unsigned long long u64;
> 
> struct data {
> 	u64 arr[1023];
> 	u64 flag;
> };
> 
> void foo(struct data *x)
> {
> 	int i;
> 
> 	for (i = 0; i < 1023; ++i)
> 		x->arr[i] = 0;
> 
> 	x->flag = 0;
> }
> 
> void bar(u64 *x)
> {
> 	*x = 0xabcdef10abcdef10;
> }
> 
> Then arm64 clang (-O2) generates the following for foo:
> 
> foo:                                    // @foo
> 	stp	x29, x30, [sp, #-16]!   // 16-byte Folded Spill
> 	orr	w2, wzr, #0x2000
> 	mov	w1, wzr
> 	mov	x29, sp
> 	bl	memset
> 	ldp	x29, x30, [sp], #16     // 16-byte Folded Reload
> 	ret
> 
> and so the store to 'flag' has become part of the memset, which could
> easily be bytewise in terms of atomicity (and this isn't unlikely given
> we have a DC ZVA instruction which only guaratees bytewise atomicity).
> 
> GCC (also -O2) generates the following for bar:
> 
> bar:
> 	mov	w1, 61200
> 	movk	w1, 0xabcd, lsl 16
> 	stp	w1, w1, [x0]
> 	ret
> 
> and so it is using a store-pair instruction to reduce the complexity in
> the immediate generation. Thus, the 64-bit store will only have 32-bit
> atomicity. In fact, this is scary because if I change bar to:
> 
> void bar(u64 *x)
> {
> 	*(volatile u64 *)x = 0xabcdef10abcdef10;
> }
> 
> then I get:
> 
> bar:
> 	mov	w1, 61200
> 	movk	w1, 0xabcd, lsl 16
> 	str	w1, [x0]
> 	str	w1, [x0, 4]
> 	ret
> 
> so I'm not sure that WRITE_ONCE would even help :/

Well, I can have the LWN article cite your email, then.  So thank you
very much!

Is generation of this code for a 64-bit volatile store considered a bug?
Or does ARMv8 exclude the possibility of 64-bit MMIO registers?  And I
would guess that Thomas and Linus would ask a similar bugginess question
for normal stores.  ;-)

> It's worth noting that:
> 
> void baz(atomic_long *x)
> {
> 	atomic_store_explicit(x, 0xabcdef10abcdef10, memory_order_relaxed)
> }
> 
> does the right thing:
> 
> baz:
> 	mov	x1, 61200
> 	movk	x1, 0xabcd, lsl 16
> 	movk	x1, 0xef10, lsl 32
> 	movk	x1, 0xabcd, lsl 48
> 	str	x1, [x0]
> 	ret

OK, the C11 and C++11 guys should be happy with this.

> Whilst these examples may be contrived, I do thing they illustrate that
> we can't simply say "stores to aligned, word-sized pointers are atomic".

And thank you again!

							Thanx, Paul


  reply	other threads:[~2019-08-21 13:23 UTC|newest]

Thread overview: 60+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-05-18 10:29 WARNING in tracepoint_probe_register_prio (3) syzbot
2019-08-16  0:11 ` syzbot
2019-08-16 14:26   ` [PATCH 1/1] Fix: trace sched switch start/stop racy updates Mathieu Desnoyers
2019-08-16 16:25     ` Steven Rostedt
2019-08-16 16:48       ` Valentin Schneider
2019-08-16 17:04         ` Steven Rostedt
2019-08-16 17:41           ` Mathieu Desnoyers
2019-08-16 19:18             ` Steven Rostedt
2019-08-16 19:19             ` Alan Stern
2019-08-16 20:44               ` Joel Fernandes
2019-08-16 20:49                 ` Thomas Gleixner
2019-08-16 20:57                   ` Joel Fernandes
2019-08-16 22:27                     ` Valentin Schneider
2019-08-16 22:57                       ` Linus Torvalds
2019-08-17  1:41                         ` Mathieu Desnoyers
2019-08-17  4:52                         ` Paul E. McKenney
2019-08-17  8:28                           ` Linus Torvalds
2019-08-17  8:44                             ` Linus Torvalds
2019-08-17 15:02                               ` Mathieu Desnoyers
2019-08-17 20:03                                 ` Valentin Schneider
2019-08-17 23:00                                   ` Paul E. McKenney
2019-08-19 10:34                                     ` Valentin Schneider
2019-08-17 22:28                             ` Paul E. McKenney
2019-08-20 14:01                           ` Peter Zijlstra
2019-08-20 20:31                             ` Paul E. McKenney
2019-08-20 20:39                               ` Peter Zijlstra
2019-08-20 20:52                                 ` Paul E. McKenney
2019-08-16 21:04                   ` Linus Torvalds
2019-08-17  1:36                     ` Mathieu Desnoyers
2019-08-17  2:13                       ` Steven Rostedt
2019-08-17 14:40                         ` Mathieu Desnoyers
2019-08-17 15:26                           ` Steven Rostedt
2019-08-17 15:55                             ` Mathieu Desnoyers
2019-08-17 16:40                               ` Steven Rostedt
2019-08-17 22:06                                 ` Paul E. McKenney
2019-08-17  8:08                       ` Linus Torvalds
2019-08-20 13:56                         ` Peter Zijlstra
2019-08-20 20:29                           ` Paul E. McKenney
2019-08-21 10:32                             ` Will Deacon
2019-08-21 13:23                               ` Paul E. McKenney [this message]
2019-08-21 13:32                                 ` Will Deacon
2019-08-21 13:56                                   ` Paul E. McKenney
2019-08-21 16:22                                     ` Will Deacon
2019-08-21 15:33                                 ` Peter Zijlstra
2019-08-21 15:48                                   ` Mathieu Desnoyers
2019-08-21 16:14                                     ` Paul E. McKenney
2019-08-21 19:03                                     ` Joel Fernandes
2019-09-09  6:21                           ` Herbert Xu
2019-08-16 20:49                 ` Steven Rostedt
2019-08-16 20:59                   ` Joel Fernandes
2019-08-17  1:25                   ` Mathieu Desnoyers
2019-08-18  9:15                   ` stable markup was " Pavel Machek
2019-08-16 17:19       ` Mathieu Desnoyers
2019-08-16 19:15         ` Steven Rostedt
2019-08-17 14:27           ` Mathieu Desnoyers
2019-08-17 15:42             ` Steven Rostedt
2019-08-17 15:53               ` Mathieu Desnoyers
2019-08-17 16:43                 ` Steven Rostedt
2019-08-16 12:32 ` WARNING in tracepoint_probe_register_prio (3) syzbot
2019-08-16 12:41   ` Sebastian Andrzej Siewior

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190821132310.GC28441@linux.ibm.com \
    --to=paulmck@linux.ibm.com \
    --cc=boqun.feng@gmail.com \
    --cc=dhowells@redhat.com \
    --cc=joel@joelfernandes.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=stern@rowland.harvard.edu \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=valentin.schneider@arm.com \
    --cc=will.deacon@arm.com \
    --cc=will@kernel.org \
    --subject='Re: [PATCH 1/1] Fix: trace sched switch start/stop racy updates' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.