linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Steven Rostedt <rostedt@goodmis.org>
To: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Valentin Schneider <valentin.schneider@arm.com>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	"Joel Fernandes, Google" <joel@joelfernandes.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	paulmck <paulmck@linux.ibm.com>,
	Boqun Feng <boqun.feng@gmail.com>,
	Will Deacon <will.deacon@arm.com>,
	David Howells <dhowells@redhat.com>,
	Alan Stern <stern@rowland.harvard.edu>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [PATCH 1/1] Fix: trace sched switch start/stop racy updates
Date: Fri, 16 Aug 2019 15:18:46 -0400	[thread overview]
Message-ID: <20190816151846.54692edc@oasis.local.home> (raw)
In-Reply-To: <241506096.21688.1565977319832.JavaMail.zimbra@efficios.com>

On Fri, 16 Aug 2019 13:41:59 -0400 (EDT)
Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:

> ----- On Aug 16, 2019, at 1:04 PM, rostedt rostedt@goodmis.org wrote:
> 
> > On Fri, 16 Aug 2019 17:48:59 +0100
> > Valentin Schneider <valentin.schneider@arm.com> wrote:
> >   
> >> On 16/08/2019 17:25, Steven Rostedt wrote:  
> >> >> Also, write and read to/from those variables should be done with
> >> >> WRITE_ONCE() and READ_ONCE(), given that those are read within tracing
> >> >> probes without holding the sched_register_mutex.
> >> >>    
> >> > 
> >> > I understand the READ_ONCE() but is the WRITE_ONCE() truly necessary?
> >> > It's done while holding the mutex. It's not that critical of a path,
> >> > and makes the code look ugly.
> >> >     
> >> 
> >> I seem to recall something like locking primitives don't protect you from
> >> store tearing / invented stores, so if you can have concurrent readers
> >> using READ_ONCE(), there should be a WRITE_ONCE() on the writer side, even
> >> if it's done in a critical section.  
> > 
> > But for this, it really doesn't matter. The READ_ONCE() is for going
> > from 0->1 or 1->0 any other change is the same as 1.  
> 
> Let's consider this "invented store" scenario (even though as I noted in my
> earlier email, I suspect this particular instance of the code does not appear
> to fit the requirements to generate this in the wild with current compilers):
> 
> intial state:
> 
> sched_tgid_ref = 10;
> 
> Thread A                                Thread B
> 
> registration                            tracepoint probe
> sched_tgid_ref++
>   - compiler decides to invent a
>     store: sched_tgid_ref = 0

This looks to me that this would cause more issues in other parts of
the code if a compiler ever decided to do this.

But I still don't care for this case. It's a cache, nothing more. No
guarantee that anything will be recorded.


>                                         READ_ONCE(sched_tgid_ref) ->
> observes 0, but should really see either 10 or 11.
>   - compiler stores 11 into
>     sched_tgid_ref
> 
> This kind of scenario could cause spurious missing data in the middle
> of a trace caused by another user trying to increment the reference
> count, which is really unexpected.
> 
> A similar scenario can happen for "store tearing" if the compiler
> decides to break the store into many stores close to the 16-bit
> overflow value when updating a 32-bit reference count. Spurious 1 ->
> 0 -> 1 transitions could be observed by readers.
> 
> > When we enable trace events, we start recording the tasks comms such
> > that we can possibly map them to the pids. When we disable trace
> > events, we stop recording the comms so that we don't overwrite the
> > cache when not needed. Note, if more than the max cache of tasks are
> > recorded during a session, we are likely to miss comms anyway.
> > 
> > Thinking about this more, the READ_ONCE() and WRTIE_ONCE() are not
> > even needed, because this is just a best effort anyway.  
> 
> If you choose not to use READ_ONCE(), then the "load tearing" issue
> can cause similar spurious 1 -> 0 -> 1 transitions near 16-bit counter
> overflow as described above. The "Invented load" also becomes an
> issue, because the compiler could use the loaded value for a branch,
> and re-load that value between two branches which are expected to use
> the same value, effectively generating a corrupted state.
> 
> I think we need a statement about whether READ_ONCE/WRITE_ONCE should
> be used in this kind of situation, or if we are fine dealing with the
> awkward compiler side-effects when they will occur.
> 

Let me put it this way. My biggest issue with this, is that the
READ/WRITE_ONCE() has *nothing* to do with the bug you are trying to
fix.

That bug is that we did the decision of starting the probes outside the
mutex. That is fixed my moving the decision into the mutex. The
READ/WRITE_ONCE() is just added noise.

-- Steve

  reply	other threads:[~2019-08-16 19:18 UTC|newest]

Thread overview: 60+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-05-18 10:29 WARNING in tracepoint_probe_register_prio (3) syzbot
2019-08-16  0:11 ` syzbot
2019-08-16 14:26   ` [PATCH 1/1] Fix: trace sched switch start/stop racy updates Mathieu Desnoyers
2019-08-16 16:25     ` Steven Rostedt
2019-08-16 16:48       ` Valentin Schneider
2019-08-16 17:04         ` Steven Rostedt
2019-08-16 17:41           ` Mathieu Desnoyers
2019-08-16 19:18             ` Steven Rostedt [this message]
2019-08-16 19:19             ` Alan Stern
2019-08-16 20:44               ` Joel Fernandes
2019-08-16 20:49                 ` Thomas Gleixner
2019-08-16 20:57                   ` Joel Fernandes
2019-08-16 22:27                     ` Valentin Schneider
2019-08-16 22:57                       ` Linus Torvalds
2019-08-17  1:41                         ` Mathieu Desnoyers
2019-08-17  4:52                         ` Paul E. McKenney
2019-08-17  8:28                           ` Linus Torvalds
2019-08-17  8:44                             ` Linus Torvalds
2019-08-17 15:02                               ` Mathieu Desnoyers
2019-08-17 20:03                                 ` Valentin Schneider
2019-08-17 23:00                                   ` Paul E. McKenney
2019-08-19 10:34                                     ` Valentin Schneider
2019-08-17 22:28                             ` Paul E. McKenney
2019-08-20 14:01                           ` Peter Zijlstra
2019-08-20 20:31                             ` Paul E. McKenney
2019-08-20 20:39                               ` Peter Zijlstra
2019-08-20 20:52                                 ` Paul E. McKenney
2019-08-16 21:04                   ` Linus Torvalds
2019-08-17  1:36                     ` Mathieu Desnoyers
2019-08-17  2:13                       ` Steven Rostedt
2019-08-17 14:40                         ` Mathieu Desnoyers
2019-08-17 15:26                           ` Steven Rostedt
2019-08-17 15:55                             ` Mathieu Desnoyers
2019-08-17 16:40                               ` Steven Rostedt
2019-08-17 22:06                                 ` Paul E. McKenney
2019-08-17  8:08                       ` Linus Torvalds
2019-08-20 13:56                         ` Peter Zijlstra
2019-08-20 20:29                           ` Paul E. McKenney
2019-08-21 10:32                             ` Will Deacon
2019-08-21 13:23                               ` Paul E. McKenney
2019-08-21 13:32                                 ` Will Deacon
2019-08-21 13:56                                   ` Paul E. McKenney
2019-08-21 16:22                                     ` Will Deacon
2019-08-21 15:33                                 ` Peter Zijlstra
2019-08-21 15:48                                   ` Mathieu Desnoyers
2019-08-21 16:14                                     ` Paul E. McKenney
2019-08-21 19:03                                     ` Joel Fernandes
2019-09-09  6:21                           ` Herbert Xu
2019-08-16 20:49                 ` Steven Rostedt
2019-08-16 20:59                   ` Joel Fernandes
2019-08-17  1:25                   ` Mathieu Desnoyers
2019-08-18  9:15                   ` stable markup was " Pavel Machek
2019-08-16 17:19       ` Mathieu Desnoyers
2019-08-16 19:15         ` Steven Rostedt
2019-08-17 14:27           ` Mathieu Desnoyers
2019-08-17 15:42             ` Steven Rostedt
2019-08-17 15:53               ` Mathieu Desnoyers
2019-08-17 16:43                 ` Steven Rostedt
2019-08-16 12:32 ` WARNING in tracepoint_probe_register_prio (3) syzbot
2019-08-16 12:41   ` Sebastian Andrzej Siewior

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190816151846.54692edc@oasis.local.home \
    --to=rostedt@goodmis.org \
    --cc=boqun.feng@gmail.com \
    --cc=dhowells@redhat.com \
    --cc=joel@joelfernandes.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=paulmck@linux.ibm.com \
    --cc=peterz@infradead.org \
    --cc=stern@rowland.harvard.edu \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=valentin.schneider@arm.com \
    --cc=will.deacon@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).