linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@kernel.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Steven Rostedt <rostedt@goodmis.org>, linke li <lilinke99@qq.com>,
	joel@joelfernandes.org, boqun.feng@gmail.com, dave@stgolabs.net,
	frederic@kernel.org, jiangshanlai@gmail.com,
	josh@joshtriplett.org, linux-kernel@vger.kernel.org,
	mathieu.desnoyers@efficios.com, qiang.zhang1211@gmail.com,
	quic_neeraju@quicinc.com, rcu@vger.kernel.org
Subject: Re: [PATCH] rcutorture: Fix rcu_torture_pipe_update_one()/rcu_torture_writer() data race and concurrency bug
Date: Wed, 6 Mar 2024 18:29:43 -0800	[thread overview]
Message-ID: <83b47424-e5e0-46de-aa63-d413a5aa6cec@paulmck-laptop> (raw)
In-Reply-To: <CAHk-=wgPAZ4KnCQergqAOUypwinYh=gZ0q4EQbwvuUcJ_8UK+Q@mail.gmail.com>

On Wed, Mar 06, 2024 at 11:46:10AM -0800, Linus Torvalds wrote:
> On Wed, 6 Mar 2024 at 11:27, Steven Rostedt <rostedt@goodmis.org> wrote:
> >
> > Note this has nothing to do with tracing. This thread is in RCU. I just
> > happen to receive the same patch "fix" for my code.
> 
> Ok, googling for rtort_pipe_count, I can only state that that code is
> complete garbage.

Well, you all do have to admit that I was right about something, namely
that Linus did not suffer in silence.  ;-)

TL;DR:  Those ->rtort_pipe_count increments cannot run concurrently
with each other or any other update of that field, so that update-side
READ_ONCE() call is unnecessary and the update-side plain C-language
read is OK.  The WRITE_ONCE() calls are there for the benefit of the
lockless read-side accesses to rtort_pipe_count.

Of course, I will fix.

> And no amount of READ_ONCE/WRITE_ONCE will fix it.
> 
> For one thing, we have this code:
> 
>         WRITE_ONCE(rp->rtort_pipe_count, i + 1);
>         if (rp->rtort_pipe_count >= RCU_TORTURE_PIPE_LEN) {
> 
> which is broken by design. The compiler is allowed to (and probably
> does) turn that into just
> 
>         WRITE_ONCE(rp->rtort_pipe_count, i + 1);
>         if (i + 1 >= RCU_TORTURE_PIPE_LEN) {
> 
> which only results in the question "Why didn't the source code do that
> obvious simplification itself?"

Oddly enough, we got a patch to this effect just yesterday evening,
Pacific Time:

https://lore.kernel.org/all/tencent_5D8919B7D1F21906868DE81406015360270A@qq.com/

But that of course only fixes one aspect of this mess.

> So that code is actively *STUPID*. It's randomly mixing READ_ONCE and
> regular reads in ways that just makes me go: "there's no saving this
> shit".
> 
> This needs fixing. Having tests that have random code in them only
> makes me doubt that the *TEST* itself is correct, rather than the code
> it is trying to actually test.

Huh.  There should only be one CPU updating ->rtort_pipe_count at at time.
Which to your point would make that READ_ONCE() call not only unnecessary,
but confusing as well.  Here is the intended uses of ->rtort_pipe_count:

1.	Zeroed when the item is first allocated.

2.	Incremented at the end of each subsequent grace period,
	so there is no concurrency with previous instances of
	this increment and also none with the zeroing.

	This happens in two different ways.  When testing things like
	call_rcu(), the RCU callback does the update, and if we have not
	yet reached the limit, passes that same callback to call_rcu().
	When testing other grace-period primitives (synchronize_rcu(),
	for example), we put the item on a list, and update each element
	of that list after each subsequent synchronous grace period.
	Once an element has pased through RCU_TORTURE_PIPE_LEN grace
	periods, it is removed from that list and freed.

	Either way, there is only ever one CPU incrementing a given
	->rtort_pipe_count at any given time.

3.	Freed, and not passed to another grace period, thus avoiding
	concurrency with a future #2.  Allocation and free is protected by
	a spinlock, so there is no concurrency with #1 above.  The same
	CPU that did the last increment does the free, so there is also
	no concurrency with the last #2.

I fired up a KCSAN run with added ASSERT_EXCLUSIVE_WRITER() calls, which
agrees with this analysis.  (And I will run longer runs to double-check.)

> And dammit, none of that makes sense anyway. This is not some
> performance-crticial code. Why is it not using proper atomics if there
> is an actual data race?
> 
> The reason to use READ_ONCE() and WRITE_ONCE() is that they can be a
> lot faster than atomics, or - more commonly - because you have some
> fundamental algorithm that doesn't do arithmetic, but cares about some
> "state at time X" (the RCU _pointer_ being one such obvious case, but
> doing an *increment* sure as hell isn't).

The only data race is between the one-at-a-time increment and the
lockless reads in the RCU readers.  So the RCU readers need READ_ONCE()
for ->rtort_pipe_count, but again the updater does not.

Which means that the compiler optimization that Linus pointed out above
really is an optimization for once because the compiler is for once
correct that nothing else is updating ->rtort_pipe_count.

> So using those READ_ONCE/WRITE_ONCE macros for that thing is
> fundamntally wrong to begin with.
> 
> The question should not be "should we add another READ_ONCE()". The
> question should be "what drugs were people on when writing this code"?

So what (if anything) was I thinking when I committed those update-side
READ_ONCE() calls?

202489101f2e6c ("rcutorture: Fix rcu_torture_one_read()/rcu_torture_writer() data race")

The commit log says the right thing, but I nevertheless added that
unnecessary READ_ONCE() call.  And here I was hoping to be able to blame
someone else!  ;-)

> People - please just stop writing garbage.
> 
> That 'rtort_pipe_count' should be an atomic_t, and the "add one and
> return the old value" should be an "atomic_inc_return()-1" (the "-1"
> is because "inc_return" returns the *new* value).
> 
> And feel free to add "_relaxed()" to that atomic op because this code
> doesn't care about ordering of that counter. It will help on some
> architectures, but as mentioned, this is not performance-crticial code
> to begin with.

There is no update-side concurrency, so there is no need for atomics.
I just need to get rid of that extraneous update-side READ_ONCE() call.

Athough I am not sure that I can credibly promise to never ever write
garbage, I most certainly can fix this particular instance.  :-/

							Thanx, Paul

  parent reply	other threads:[~2024-03-07  2:29 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-04 10:54 [PATCH] rcutorture: Fix rcu_torture_pipe_update_one()/rcu_torture_writer() data race and concurrency bug linke li
2024-03-04 16:19 ` Joel Fernandes
2024-03-04 17:14   ` Paul E. McKenney
2024-03-04 19:10     ` Joel Fernandes
2024-03-04 19:44       ` Paul E. McKenney
2024-03-04 20:13         ` Joel Fernandes
2024-03-04 20:47           ` Paul E. McKenney
2024-03-05  3:30             ` linke
2024-03-05  6:24   ` linke li
2024-03-06 15:37     ` Steven Rostedt
2024-03-06 17:36       ` Paul E. McKenney
2024-03-06 18:01         ` Steven Rostedt
2024-03-06 18:09           ` Paul E. McKenney
2024-03-06 18:20             ` Steven Rostedt
2024-03-06 18:43           ` Linus Torvalds
2024-03-06 18:55             ` Steven Rostedt
2024-03-06 19:01               ` Linus Torvalds
2024-03-06 19:27                 ` Linus Torvalds
2024-03-06 19:47                   ` Steven Rostedt
2024-03-06 20:06                     ` Linus Torvalds
2024-03-07 13:20                       ` Steven Rostedt
2024-03-07 16:12                         ` Steven Rostedt
2024-03-06 19:27                 ` Steven Rostedt
2024-03-06 19:46                   ` Linus Torvalds
2024-03-06 20:20                     ` Linus Torvalds
2024-03-07  2:29                     ` Paul E. McKenney [this message]
2024-03-07  2:43                       ` Linus Torvalds
2024-03-07  2:49                         ` Linus Torvalds
2024-03-07  3:21                           ` Paul E. McKenney
2024-03-07  3:06                         ` Paul E. McKenney
2024-03-07  3:06                         ` Mathieu Desnoyers
2024-03-07  3:37                           ` Paul E. McKenney
2024-03-07  5:44                             ` Joel Fernandes
2024-03-07 19:05                               ` Paul E. McKenney
2024-03-07 13:53                             ` Mathieu Desnoyers
2024-03-07 19:47                               ` Paul E. McKenney
2024-03-07 19:53                                 ` Mathieu Desnoyers
2024-03-08  0:58                                   ` Paul E. McKenney
2024-03-07 20:00                                 ` Linus Torvalds
2024-03-07 20:57                                   ` Paul E. McKenney
2024-03-07 21:40                                     ` Julia Lawall
2024-03-07 22:09                                       ` Linus Torvalds
2024-03-08  0:55                                         ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=83b47424-e5e0-46de-aa63-d413a5aa6cec@paulmck-laptop \
    --to=paulmck@kernel.org \
    --cc=boqun.feng@gmail.com \
    --cc=dave@stgolabs.net \
    --cc=frederic@kernel.org \
    --cc=jiangshanlai@gmail.com \
    --cc=joel@joelfernandes.org \
    --cc=josh@joshtriplett.org \
    --cc=lilinke99@qq.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=qiang.zhang1211@gmail.com \
    --cc=quic_neeraju@quicinc.com \
    --cc=rcu@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).