All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: rostedt <rostedt@goodmis.org>
Cc: Valentin Schneider <valentin.schneider@arm.com>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	"Joel Fernandes, Google" <joel@joelfernandes.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	paulmck <paulmck@linux.ibm.com>,
	Boqun Feng <boqun.feng@gmail.com>,
	Will Deacon <will.deacon@arm.com>,
	David Howells <dhowells@redhat.com>,
	Alan Stern <stern@rowland.harvard.edu>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [PATCH 1/1] Fix: trace sched switch start/stop racy updates
Date: Fri, 16 Aug 2019 13:41:59 -0400 (EDT)	[thread overview]
Message-ID: <241506096.21688.1565977319832.JavaMail.zimbra@efficios.com> (raw)
In-Reply-To: <20190816130440.07cc0a30@oasis.local.home>

----- On Aug 16, 2019, at 1:04 PM, rostedt rostedt@goodmis.org wrote:

> On Fri, 16 Aug 2019 17:48:59 +0100
> Valentin Schneider <valentin.schneider@arm.com> wrote:
> 
>> On 16/08/2019 17:25, Steven Rostedt wrote:
>> >> Also, write and read to/from those variables should be done with
>> >> WRITE_ONCE() and READ_ONCE(), given that those are read within tracing
>> >> probes without holding the sched_register_mutex.
>> >>  
>> > 
>> > I understand the READ_ONCE() but is the WRITE_ONCE() truly necessary?
>> > It's done while holding the mutex. It's not that critical of a path,
>> > and makes the code look ugly.
>> >   
>> 
>> I seem to recall something like locking primitives don't protect you from
>> store tearing / invented stores, so if you can have concurrent readers
>> using READ_ONCE(), there should be a WRITE_ONCE() on the writer side, even
>> if it's done in a critical section.
> 
> But for this, it really doesn't matter. The READ_ONCE() is for going
> from 0->1 or 1->0 any other change is the same as 1.

Let's consider this "invented store" scenario (even though as I noted in my
earlier email, I suspect this particular instance of the code does not appear
to fit the requirements to generate this in the wild with current compilers):

intial state:

sched_tgid_ref = 10;

Thread A                                Thread B

registration                            tracepoint probe
sched_tgid_ref++
  - compiler decides to invent a
    store: sched_tgid_ref = 0
                                        READ_ONCE(sched_tgid_ref) -> observes 0,
                                        but should really see either 10 or 11.
  - compiler stores 11 into
    sched_tgid_ref

This kind of scenario could cause spurious missing data in the middle of a
trace caused by another user trying to increment the reference count, which
is really unexpected.

A similar scenario can happen for "store tearing" if the compiler decides
to break the store into many stores close to the 16-bit overflow value when
updating a 32-bit reference count. Spurious 1 -> 0 -> 1 transitions could be
observed by readers.

> When we enable trace events, we start recording the tasks comms such
> that we can possibly map them to the pids. When we disable trace
> events, we stop recording the comms so that we don't overwrite the
> cache when not needed. Note, if more than the max cache of tasks are
> recorded during a session, we are likely to miss comms anyway.
> 
> Thinking about this more, the READ_ONCE() and WRTIE_ONCE() are not even
> needed, because this is just a best effort anyway.

If you choose not to use READ_ONCE(), then the "load tearing" issue can
cause similar spurious 1 -> 0 -> 1 transitions near 16-bit counter
overflow as described above. The "Invented load" also becomes an issue,
because the compiler could use the loaded value for a branch, and re-load
that value between two branches which are expected to use the same value,
effectively generating a corrupted state.

I think we need a statement about whether READ_ONCE/WRITE_ONCE should
be used in this kind of situation, or if we are fine dealing with the
awkward compiler side-effects when they will occur.

Thanks,

Mathieu

> 
> The only real fix was to move the check into the mutex protect area,
> because that can cause a real bug if there was a race.
> 
> {
> -	bool sched_register = (!sched_cmdline_ref && !sched_tgid_ref);
> +	bool sched_register;
> +
> 	mutex_lock(&sched_register_mutex);
> +	sched_register = (!sched_cmdline_ref && !sched_tgid_ref);
> 
> Thus, I'd like to see a v2 of this patch without the READ_ONCE() or
> WRITE_ONCE() added.
> 
> -- Steve

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

  reply	other threads:[~2019-08-16 17:42 UTC|newest]

Thread overview: 60+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-05-18 10:29 WARNING in tracepoint_probe_register_prio (3) syzbot
2019-08-16  0:11 ` syzbot
2019-08-16 14:26   ` [PATCH 1/1] Fix: trace sched switch start/stop racy updates Mathieu Desnoyers
2019-08-16 16:25     ` Steven Rostedt
2019-08-16 16:48       ` Valentin Schneider
2019-08-16 17:04         ` Steven Rostedt
2019-08-16 17:41           ` Mathieu Desnoyers [this message]
2019-08-16 19:18             ` Steven Rostedt
2019-08-16 19:19             ` Alan Stern
2019-08-16 20:44               ` Joel Fernandes
2019-08-16 20:49                 ` Thomas Gleixner
2019-08-16 20:57                   ` Joel Fernandes
2019-08-16 22:27                     ` Valentin Schneider
2019-08-16 22:57                       ` Linus Torvalds
2019-08-17  1:41                         ` Mathieu Desnoyers
2019-08-17  4:52                         ` Paul E. McKenney
2019-08-17  8:28                           ` Linus Torvalds
2019-08-17  8:44                             ` Linus Torvalds
2019-08-17 15:02                               ` Mathieu Desnoyers
2019-08-17 20:03                                 ` Valentin Schneider
2019-08-17 23:00                                   ` Paul E. McKenney
2019-08-19 10:34                                     ` Valentin Schneider
2019-08-17 22:28                             ` Paul E. McKenney
2019-08-20 14:01                           ` Peter Zijlstra
2019-08-20 20:31                             ` Paul E. McKenney
2019-08-20 20:39                               ` Peter Zijlstra
2019-08-20 20:52                                 ` Paul E. McKenney
2019-08-16 21:04                   ` Linus Torvalds
2019-08-17  1:36                     ` Mathieu Desnoyers
2019-08-17  2:13                       ` Steven Rostedt
2019-08-17 14:40                         ` Mathieu Desnoyers
2019-08-17 15:26                           ` Steven Rostedt
2019-08-17 15:55                             ` Mathieu Desnoyers
2019-08-17 16:40                               ` Steven Rostedt
2019-08-17 22:06                                 ` Paul E. McKenney
2019-08-17  8:08                       ` Linus Torvalds
2019-08-20 13:56                         ` Peter Zijlstra
2019-08-20 20:29                           ` Paul E. McKenney
2019-08-21 10:32                             ` Will Deacon
2019-08-21 13:23                               ` Paul E. McKenney
2019-08-21 13:32                                 ` Will Deacon
2019-08-21 13:56                                   ` Paul E. McKenney
2019-08-21 16:22                                     ` Will Deacon
2019-08-21 15:33                                 ` Peter Zijlstra
2019-08-21 15:48                                   ` Mathieu Desnoyers
2019-08-21 16:14                                     ` Paul E. McKenney
2019-08-21 19:03                                     ` Joel Fernandes
2019-09-09  6:21                           ` Herbert Xu
2019-08-16 20:49                 ` Steven Rostedt
2019-08-16 20:59                   ` Joel Fernandes
2019-08-17  1:25                   ` Mathieu Desnoyers
2019-08-18  9:15                   ` stable markup was " Pavel Machek
2019-08-16 17:19       ` Mathieu Desnoyers
2019-08-16 19:15         ` Steven Rostedt
2019-08-17 14:27           ` Mathieu Desnoyers
2019-08-17 15:42             ` Steven Rostedt
2019-08-17 15:53               ` Mathieu Desnoyers
2019-08-17 16:43                 ` Steven Rostedt
2019-08-16 12:32 ` WARNING in tracepoint_probe_register_prio (3) syzbot
2019-08-16 12:41   ` Sebastian Andrzej Siewior

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=241506096.21688.1565977319832.JavaMail.zimbra@efficios.com \
    --to=mathieu.desnoyers@efficios.com \
    --cc=boqun.feng@gmail.com \
    --cc=dhowells@redhat.com \
    --cc=joel@joelfernandes.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=paulmck@linux.ibm.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=stern@rowland.harvard.edu \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=valentin.schneider@arm.com \
    --cc=will.deacon@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.