linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@kernel.org>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Marco Elver <elver@google.com>, linux-kernel@vger.kernel.org
Subject: Re: RCU vs data_race()
Date: Mon, 21 Jun 2021 06:37:57 -0700	[thread overview]
Message-ID: <20210621133757.GS4397@paulmck-ThinkPad-P17-Gen-1> (raw)
In-Reply-To: <YNA/gkHbq46A/21C@hirez.programming.kicks-ass.net>

On Mon, Jun 21, 2021 at 09:28:02AM +0200, Peter Zijlstra wrote:
> On Sun, Jun 20, 2021 at 02:01:27PM -0700, Paul E. McKenney wrote:
> > On Sun, Jun 20, 2021 at 09:14:28PM +0200, Peter Zijlstra wrote:
> 
> > > I don't buy that argument. pr_err() (or worse) is not supposed to
> > > happen, ever. If it does, *that* is a far worse condition that any data
> > > race possibly found by kcsan.
> > > 
> > > So the only way the pr_err() expression itself can lead to kcsan
> > > determining a data-race, if something far worse triggered the pr_err()
> > > itself.
> > 
> > Earlier, you said pr_warn().  Above, I said pr_*().  Now you say
> > pr_err().  But OK...
> 
> Same, thing.. also Sundays aren't great for details it seems :-)

I know that feeling!  ;-)

> > Let's take for example the pr_err() in __call_rcu(), that is, the
> > double-free diagnostic.  A KCSAN warning on the unmarked load from
> > head->func could give valuable information on the whereabouts of the
> > other code interfering with the callback.  Blanket disabling of KCSAN
> > across all pr_err() calls (let alone all pr_*() calls) would be the
> > opposite of helpful.
> 
> I'm confused. That pr_err() should never happen in a correct program. If
> it happens, fix it and any data race as a consequence of that pr_err()
> no longer exists either.
> 
> I fundementally don't see the relevance of a possible data race from a
> statement that should never happen in a correct program to begin with.
> 
> Why do you think otherwise?

Because detection of that data race can provide valuable debugging help.

> > > You've lost me on the schedule thing, what?
> > 
> > The definition of schedule_timeout_interruptible() is in part of the
> > kernel that uses much looser KCSAN checking.  Thus there are some
> > KCSAN warnings from RCU involving schedule_timeout_interruptible().
> > But at least some of these warnings are for conflicting writes, which
> > many parts of the kernel suppress KCSAN warnings for.
> 
> You've lost me again.. schedule_timeout_interruptible() doesn't do
> writes to rcu state, does it? So how can there be problems?

Because the static inline functions end up in RCU's tranlation units,
and they do write to other state.  See for example the line marked with
the asterisk at the end of this email, which is apparently between
__run_timers() and rcu_torture_reader().  But gdb says that this is
really between this statement in __run_timers():

	base->running_timer = NULL;

And this statement in try_to_del_timer_sync():

	if (base->running_timer != timer)

Because of inline functions, running KCSAN on RCU can detect races in
other subsystems.  In this case, I could argue for a READ_ONCE() on the
"if" condition, but last I knew, the rules for timers are that C-language
writes do not participate in data races.  So maybe I should add that
READ_ONCE(), but running KCSAN on rcutorture would still complain.

Here is the stack leading to try_to_del_timer_sync():

	read to 0xffff95bbdf49c280 of 8 bytes by task 167 on cpu 0:
	 try_to_del_timer_sync+0x96/0x1b0
	 del_timer_sync+0xa8/0xe0
	 schedule_timeout+0xc9/0x140
	 schedule_timeout_interruptible+0x28/0x30
	 stutter_wait+0x18a/0x220
	 rcu_torture_reader+0xfc/0x330
	 kthread+0x1fa/0x210
	 ret_from_fork+0x22/0x30

So one option would be to add the READ_ONCE(), but put an RCU wrapper
around schedule_timeout_interruptible() to allow rcutorture to either
show all data races or only those directly related to RCU.  The reason for
wanting it to show only those directly related to RCU is that this allows
me to much more easily spot a new RCU-related data race.  The reason
for wanting it to (only sometimes!) show all data races is to help track
down any non-RCU bugs that might be afficting my rcutorture runs.

Make sense?

							Thanx, Paul

------------------------------------------------------------------------

    129 BUG: KCSAN: data-race in prandom_bytes / prandom_u32
    123 BUG: KCSAN: data-race in __hrtimer_run_queues / hrtimer_active
     56 BUG: KCSAN: data-race in mutex_spin_on_owner+0x13d/0x260
     51 BUG: KCSAN: data-race in prandom_u32 / prandom_u32
     41 BUG: KCSAN: data-race in __run_timers / try_to_del_timer_sync
     37 BUG: KCSAN: data-race in tick_sched_timer / tick_sched_timer
     34 BUG: KCSAN: data-race in tick_nohz_idle_stop_tick / tick_nohz_next_event
     26 BUG: KCSAN: data-race in tick_nohz_idle_stop_tick / tick_nohz_idle_stop_tick
*    18 BUG: KCSAN: data-race in __run_timers / rcu_torture_reader
     16 BUG: KCSAN: data-race in __hrtimer_run_queues / hrtimer_try_to_cancel
     16 BUG: KCSAN: data-race in tick_nohz_idle_stop_tick / tick_sched_timer
      6 BUG: KCSAN: data-race in cpumask_next+0x20/0x50
      6 BUG: KCSAN: data-race in __mod_timer / __run_timers
      6 BUG: KCSAN: data-race in run_timer_softirq / try_to_del_timer_sync
      6 BUG: KCSAN: data-race in rwsem_spin_on_owner+0x144/0x230
      6 BUG: KCSAN: data-race in tick_nohz_next_event / tick_sched_timer
      5 BUG: KCSAN: data-race in cpumask_next_and+0x25/0x60
      4 BUG: KCSAN: data-race in _find_next_bit+0x3f/0x120
      4 BUG: KCSAN: data-race in ps2_do_sendbyte / ps2_handle_ack
      4 BUG: KCSAN: data-race in rcu_preempt_deferred_qs_irqrestore / try_to_take_rt_mutex
      4 BUG: KCSAN: data-race in rcu_torture_reader / run_timer_softirq
      3 BUG: KCSAN: data-race in cpumask_next_and+0x27/0x70
      3 BUG: KCSAN: data-race in cpumask_next_and+0x30/0x60
      3 BUG: KCSAN: data-race in cpumask_next_wrap+0x4b/0xa0
      3 BUG: KCSAN: data-race in do_raw_write_lock / do_raw_write_unlock
      3 BUG: KCSAN: data-race in process_one_work / queue_work_on
      3 BUG: KCSAN: data-race in timer_clear_idle / trigger_dyntick_cpu
      2 BUG: KCSAN: data-race in do_raw_write_lock / do_raw_write_lock
      2 BUG: KCSAN: data-race in ktime_get_real_fast_ns+0x7d/0x130
      2 BUG: KCSAN: data-race in ktime_get_real_fast_ns+0xb5/0x130
      2 BUG: KCSAN: data-race in __mod_timer / run_timer_softirq
      2 BUG: KCSAN: data-race in queue_work_on / worker_thread
      1 BUG: KCSAN: data-race in bringup_cpu / cpuhp_should_run
      1 BUG: KCSAN: data-race in can_stop_idle_tick / tick_sched_timer
      1 BUG: KCSAN: data-race in internal_add_timer / update_process_times
      1 BUG: KCSAN: data-race in rcu_nocb_cb_kthread / rcu_nocb_gp_kthread
      1 BUG: KCSAN: data-race in rcu_preempt_deferred_qs_irqrestore / rt_mutex_unlock
      1 BUG: KCSAN: data-race in __ww_mutex_lock+0x1a3/0x1610

  reply	other threads:[~2021-06-21 13:37 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-18  8:24 RCU vs data_race() Peter Zijlstra
2021-06-18  8:59 ` Marco Elver
2021-06-18 11:26   ` Peter Zijlstra
2021-06-18 20:48     ` Paul E. McKenney
2021-06-20 19:14       ` Peter Zijlstra
2021-06-20 21:01         ` Paul E. McKenney
2021-06-21  7:28           ` Peter Zijlstra
2021-06-21 13:37             ` Paul E. McKenney [this message]
2021-07-06  8:00               ` Peter Zijlstra
2021-07-06  8:44                 ` Marco Elver
2021-07-06 10:33                   ` Peter Zijlstra
2021-07-06 14:03                     ` Paul E. McKenney
2021-07-06  8:37               ` Peter Zijlstra
2021-07-06 14:16                 ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210621133757.GS4397@paulmck-ThinkPad-P17-Gen-1 \
    --to=paulmck@kernel.org \
    --cc=elver@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).