* [PATCH tip/core/rcu 0/3] Tasks-RCU updates for v5.7
From: Paul E. McKenney @ 2020-02-15 0:24 UTC
To: rcu
Cc: linux-kernel, kernel-team, mingo, jiangshanlai, dipankar, akpm,
    mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells, edumazet,
    fweisbec, oleg, joel

Hello!

This series contains Tasks-RCU updates.

1.	Add *_ONCE() for rcu_tasks_cbs_head.

2.	Add missing annotation for exit_tasks_rcu_start(), courtesy of
	Jules Irenge.

3.	Add missing annotation for exit_tasks_rcu_finish(), courtesy of
	Jules Irenge.

							Thanx, Paul

------------------------------------------------------------------------

 update.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

* [PATCH tip/core/rcu 1/3] rcu-tasks: *_ONCE() for rcu_tasks_cbs_head
From: paulmck @ 2020-02-15 0:25 UTC
To: rcu
Cc: linux-kernel, kernel-team, mingo, jiangshanlai, dipankar, akpm,
    mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells, edumazet,
    fweisbec, oleg, joel, Paul E. McKenney

From: "Paul E. McKenney" <paulmck@kernel.org>

The RCU tasks list of callbacks, rcu_tasks_cbs_head, is sampled locklessly
by rcu_tasks_kthread() when waiting for work to do. This commit therefore
applies READ_ONCE() to that lockless sampling and WRITE_ONCE() to the
single potential store outside of rcu_tasks_kthread.

This data race was reported by KCSAN. Not appropriate for backporting
due to failure being unlikely.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/update.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
index 6c4b862..a27df76 100644
--- a/kernel/rcu/update.c
+++ b/kernel/rcu/update.c
@@ -528,7 +528,7 @@ void call_rcu_tasks(struct rcu_head *rhp, rcu_callback_t func)
 	rhp->func = func;
 	raw_spin_lock_irqsave(&rcu_tasks_cbs_lock, flags);
 	needwake = !rcu_tasks_cbs_head;
-	*rcu_tasks_cbs_tail = rhp;
+	WRITE_ONCE(*rcu_tasks_cbs_tail, rhp);
 	rcu_tasks_cbs_tail = &rhp->next;
 	raw_spin_unlock_irqrestore(&rcu_tasks_cbs_lock, flags);
 	/* We can't create the thread unless interrupts are enabled. */
@@ -658,7 +658,7 @@ static int __noreturn rcu_tasks_kthread(void *arg)
 		/* If there were none, wait a bit and start over. */
 		if (!list) {
 			wait_event_interruptible(rcu_tasks_cbs_wq,
-						 rcu_tasks_cbs_head);
+						 READ_ONCE(rcu_tasks_cbs_head));
 			if (!rcu_tasks_cbs_head) {
 				WARN_ON(signal_pending(current));
 				schedule_timeout_interruptible(HZ/10);
-- 
2.9.5

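The pattern the patch above marks can be sketched in userspace C. This is only an illustration under stated assumptions: the READ_ONCE() and WRITE_ONCE() macros below are simplified volatile-cast stand-ins (the kernel's definitions are more elaborate), they rely on the GNU __typeof__ extension, and struct cb, cbs_head, cbs_tail, enqueue_cb() and have_work() are hypothetical names rather than the code in kernel/rcu/update.c.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simplified userspace stand-ins for the kernel macros (an assumption for
 * illustration only).  The volatile cast asks the compiler to emit exactly
 * one access for a scalar-sized object.
 */
#define READ_ONCE(x)		(*(volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val)	(*(volatile __typeof__(x) *)&(x) = (val))

struct cb {				/* hypothetical stand-in for struct rcu_head */
	struct cb *next;
};

static struct cb *cbs_head;			/* sampled without a lock */
static struct cb **cbs_tail = &cbs_head;	/* updated under a lock */

/* Enqueue a callback; the caller is assumed to hold the list lock. */
static void enqueue_cb(struct cb *cbp)
{
	assert(cbp != NULL);
	cbp->next = NULL;
	/* When the list is empty, this store lands on cbs_head itself. */
	WRITE_ONCE(*cbs_tail, cbp);
	cbs_tail = &cbp->next;
}

/* Lockless "is there work?" check: load, compare against NULL, discard. */
static int have_work(void)
{
	return READ_ONCE(cbs_head) != NULL;
}
```

The store through the tail pointer is the one that can race with the lockless check, because the tail pointer refers to the head itself whenever the list is empty.
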
* Re: [PATCH tip/core/rcu 1/3] rcu-tasks: *_ONCE() for rcu_tasks_cbs_head
From: Peter Zijlstra @ 2020-02-17 12:38 UTC
To: paulmck
Cc: rcu, linux-kernel, kernel-team, mingo, jiangshanlai, dipankar, akpm,
    mathieu.desnoyers, josh, tglx, rostedt, dhowells, edumazet, fweisbec,
    oleg, joel

On Fri, Feb 14, 2020 at 04:25:18PM -0800, paulmck@kernel.org wrote:
> From: "Paul E. McKenney" <paulmck@kernel.org>
>
> The RCU tasks list of callbacks, rcu_tasks_cbs_head, is sampled locklessly
> by rcu_tasks_kthread() when waiting for work to do. This commit therefore
> applies READ_ONCE() to that lockless sampling and WRITE_ONCE() to the
> single potential store outside of rcu_tasks_kthread.
>
> This data race was reported by KCSAN. Not appropriate for backporting
> due to failure being unlikely.

What failure is possible here? AFAICT this is (again) one of them
load-compare-against-constant-discard patterns that are impossible to
mess up.

> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> ---
>  kernel/rcu/update.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
> index 6c4b862..a27df76 100644
> --- a/kernel/rcu/update.c
> +++ b/kernel/rcu/update.c
> @@ -528,7 +528,7 @@ void call_rcu_tasks(struct rcu_head *rhp, rcu_callback_t func)
>  	rhp->func = func;
>  	raw_spin_lock_irqsave(&rcu_tasks_cbs_lock, flags);
>  	needwake = !rcu_tasks_cbs_head;
> -	*rcu_tasks_cbs_tail = rhp;
> +	WRITE_ONCE(*rcu_tasks_cbs_tail, rhp);
>  	rcu_tasks_cbs_tail = &rhp->next;
>  	raw_spin_unlock_irqrestore(&rcu_tasks_cbs_lock, flags);
>  	/* We can't create the thread unless interrupts are enabled. */
> @@ -658,7 +658,7 @@ static int __noreturn rcu_tasks_kthread(void *arg)
>  		/* If there were none, wait a bit and start over. */
>  		if (!list) {
>  			wait_event_interruptible(rcu_tasks_cbs_wq,
> -						 rcu_tasks_cbs_head);
> +						 READ_ONCE(rcu_tasks_cbs_head));
>  			if (!rcu_tasks_cbs_head) {
>  				WARN_ON(signal_pending(current));
>  				schedule_timeout_interruptible(HZ/10);
> --
> 2.9.5
>

* Re: [PATCH tip/core/rcu 1/3] rcu-tasks: *_ONCE() for rcu_tasks_cbs_head
From: Paul E. McKenney @ 2020-02-17 18:16 UTC
To: Peter Zijlstra
Cc: rcu, linux-kernel, kernel-team, mingo, jiangshanlai, dipankar, akpm,
    mathieu.desnoyers, josh, tglx, rostedt, dhowells, edumazet, fweisbec,
    oleg, joel

On Mon, Feb 17, 2020 at 01:38:51PM +0100, Peter Zijlstra wrote:
> On Fri, Feb 14, 2020 at 04:25:18PM -0800, paulmck@kernel.org wrote:
> > From: "Paul E. McKenney" <paulmck@kernel.org>
> >
> > The RCU tasks list of callbacks, rcu_tasks_cbs_head, is sampled locklessly
> > by rcu_tasks_kthread() when waiting for work to do. This commit therefore
> > applies READ_ONCE() to that lockless sampling and WRITE_ONCE() to the
> > single potential store outside of rcu_tasks_kthread.
> >
> > This data race was reported by KCSAN. Not appropriate for backporting
> > due to failure being unlikely.
>
> What failure is possible here? AFAICT this is (again) one of them
> load-compare-against-constant-discard patterns that are impossible to
> mess up.

First, please keep in mind that this is RCU code. Rather uncomplicated
for RCU, to be sure, but still RCU code.

The failure modes are thus as follows:

o	I produce a patch for which KCSAN gives a legitimate warning,
	but this warning is obscured by a pile of other warnings.
	Yes, we should continue improving KCSAN's ability to adapt
	to the user's desired compiler-optimization risk level, but
	in RCU's case that risk level is set quite low.

	In RCU, what others are calling false positives are therefore
	addressed. Yes, this does cost me a bit of work, but it is
	trivial compared to the work required to track down a real bug.

o	Someone optimizes or otherwise changes the wait/wakeup code,
	which inadvertently gives the compiler more scope for mischief.

In short, within RCU, I am handling all KCSAN complaints. This is looking
to be an extremely inexpensive insurance policy for RCU. Other subsystems
are of course free to make their own tradeoffs, and subsystems having
less-aggressive concurrency control might be well-advised to take a
different path than the one I am taking.

							Thanx, Paul

> > Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> > ---
> >  kernel/rcu/update.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
> > index 6c4b862..a27df76 100644
> > --- a/kernel/rcu/update.c
> > +++ b/kernel/rcu/update.c
> > @@ -528,7 +528,7 @@ void call_rcu_tasks(struct rcu_head *rhp, rcu_callback_t func)
> >  	rhp->func = func;
> >  	raw_spin_lock_irqsave(&rcu_tasks_cbs_lock, flags);
> >  	needwake = !rcu_tasks_cbs_head;
> > -	*rcu_tasks_cbs_tail = rhp;
> > +	WRITE_ONCE(*rcu_tasks_cbs_tail, rhp);
> >  	rcu_tasks_cbs_tail = &rhp->next;
> >  	raw_spin_unlock_irqrestore(&rcu_tasks_cbs_lock, flags);
> >  	/* We can't create the thread unless interrupts are enabled. */
> > @@ -658,7 +658,7 @@ static int __noreturn rcu_tasks_kthread(void *arg)
> >  		/* If there were none, wait a bit and start over. */
> >  		if (!list) {
> >  			wait_event_interruptible(rcu_tasks_cbs_wq,
> > -						 rcu_tasks_cbs_head);
> > +						 READ_ONCE(rcu_tasks_cbs_head));
> >  			if (!rcu_tasks_cbs_head) {
> >  				WARN_ON(signal_pending(current));
> >  				schedule_timeout_interruptible(HZ/10);
> > --
> > 2.9.5
> >

* Re: [PATCH tip/core/rcu 1/3] rcu-tasks: *_ONCE() for rcu_tasks_cbs_head
From: Peter Zijlstra @ 2020-02-18 7:56 UTC
To: Paul E. McKenney
Cc: rcu, linux-kernel, kernel-team, mingo, jiangshanlai, dipankar, akpm,
    mathieu.desnoyers, josh, tglx, rostedt, dhowells, edumazet, fweisbec,
    oleg, joel

On Mon, Feb 17, 2020 at 10:16:16AM -0800, Paul E. McKenney wrote:
> On Mon, Feb 17, 2020 at 01:38:51PM +0100, Peter Zijlstra wrote:
> > On Fri, Feb 14, 2020 at 04:25:18PM -0800, paulmck@kernel.org wrote:
> > > From: "Paul E. McKenney" <paulmck@kernel.org>
> > >
> > > The RCU tasks list of callbacks, rcu_tasks_cbs_head, is sampled locklessly
> > > by rcu_tasks_kthread() when waiting for work to do. This commit therefore
> > > applies READ_ONCE() to that lockless sampling and WRITE_ONCE() to the
> > > single potential store outside of rcu_tasks_kthread.
> > >
> > > This data race was reported by KCSAN. Not appropriate for backporting
> > > due to failure being unlikely.
> >
> > What failure is possible here? AFAICT this is (again) one of them
> > load-compare-against-constant-discard patterns that are impossible to
> > mess up.
>
> First, please keep in mind that this is RCU code. Rather uncomplicated
> for RCU, to be sure, but still RCU code.
>
> The failure modes are thus as follows:
>
> o	I produce a patch for which KCSAN gives a legitimate warning,
> 	but this warning is obscured by a pile of other warnings.
> 	Yes, we should continue improving KCSAN's ability to adapt
> 	to the user's desired compiler-optimization risk level, but
> 	in RCU's case that risk level is set quite low.
>
> 	In RCU, what others are calling false positives are therefore
> 	addressed. Yes, this does cost me a bit of work, but it is
> 	trivial compared to the work required to track down a real bug.
>
> o	Someone optimizes or otherwise changes the wait/wakeup code,
> 	which inadvertently gives the compiler more scope for mischief.
>
> In short, within RCU, I am handling all KCSAN complaints. This is looking
> to be an extremely inexpensive insurance policy for RCU. Other subsystems
> are of course free to make their own tradeoffs, and subsystems having
> less-aggressive concurrency control might be well-advised to take a
> different path than the one I am taking.

I just took offence at the Changelog wording. It seems to suggest there
actually is a problem, there is not.

* Re: [PATCH tip/core/rcu 1/3] rcu-tasks: *_ONCE() for rcu_tasks_cbs_head
From: Paul E. McKenney @ 2020-02-18 16:27 UTC
To: Peter Zijlstra
Cc: rcu, linux-kernel, kernel-team, mingo, jiangshanlai, dipankar, akpm,
    mathieu.desnoyers, josh, tglx, rostedt, dhowells, edumazet, fweisbec,
    oleg, joel

On Tue, Feb 18, 2020 at 08:56:48AM +0100, Peter Zijlstra wrote:
> On Mon, Feb 17, 2020 at 10:16:16AM -0800, Paul E. McKenney wrote:
> > On Mon, Feb 17, 2020 at 01:38:51PM +0100, Peter Zijlstra wrote:
> > > On Fri, Feb 14, 2020 at 04:25:18PM -0800, paulmck@kernel.org wrote:
> > > > From: "Paul E. McKenney" <paulmck@kernel.org>
> > > >
> > > > The RCU tasks list of callbacks, rcu_tasks_cbs_head, is sampled locklessly
> > > > by rcu_tasks_kthread() when waiting for work to do. This commit therefore
> > > > applies READ_ONCE() to that lockless sampling and WRITE_ONCE() to the
> > > > single potential store outside of rcu_tasks_kthread.
> > > >
> > > > This data race was reported by KCSAN. Not appropriate for backporting
> > > > due to failure being unlikely.
> > >
> > > What failure is possible here? AFAICT this is (again) one of them
> > > load-compare-against-constant-discard patterns that are impossible to
> > > mess up.
> >
> > First, please keep in mind that this is RCU code. Rather uncomplicated
> > for RCU, to be sure, but still RCU code.
> >
> > The failure modes are thus as follows:
> >
> > o	I produce a patch for which KCSAN gives a legitimate warning,
> > 	but this warning is obscured by a pile of other warnings.
> > 	Yes, we should continue improving KCSAN's ability to adapt
> > 	to the user's desired compiler-optimization risk level, but
> > 	in RCU's case that risk level is set quite low.
> >
> > 	In RCU, what others are calling false positives are therefore
> > 	addressed. Yes, this does cost me a bit of work, but it is
> > 	trivial compared to the work required to track down a real bug.
> >
> > o	Someone optimizes or otherwise changes the wait/wakeup code,
> > 	which inadvertently gives the compiler more scope for mischief.
> >
> > In short, within RCU, I am handling all KCSAN complaints. This is looking
> > to be an extremely inexpensive insurance policy for RCU. Other subsystems
> > are of course free to make their own tradeoffs, and subsystems having
> > less-aggressive concurrency control might be well-advised to take a
> > different path than the one I am taking.
>
> I just took offence at the Changelog wording. It seems to suggest there
> actually is a problem, there is not.

Quoting the changelog: "Not appropriate for backporting due to failure
being unlikely."

Good enough?

							Thanx, Paul

* Re: [PATCH tip/core/rcu 1/3] rcu-tasks: *_ONCE() for rcu_tasks_cbs_head
From: Peter Zijlstra @ 2020-02-18 20:11 UTC
To: Paul E. McKenney
Cc: rcu, linux-kernel, kernel-team, mingo, jiangshanlai, dipankar, akpm,
    mathieu.desnoyers, josh, tglx, rostedt, dhowells, edumazet, fweisbec,
    oleg, joel

On Tue, Feb 18, 2020 at 08:27:19AM -0800, Paul E. McKenney wrote:
> On Tue, Feb 18, 2020 at 08:56:48AM +0100, Peter Zijlstra wrote:

> > I just took offence at the Changelog wording. It seems to suggest there
> > actually is a problem, there is not.
>
> Quoting the changelog: "Not appropriate for backporting due to failure
> being unlikely."

That implies there is failure, however unlikely.

In this particular case there is absolutely no failure, except perhaps
in KCSAN. This patch is a pure annotation such that KCSAN can understand
the code.

Like said, I don't object to the actual patch, but I do think it is
important to call out false negatives or to describe the actual problem
found.

* Re: [PATCH tip/core/rcu 1/3] rcu-tasks: *_ONCE() for rcu_tasks_cbs_head
From: Paul E. McKenney @ 2020-02-18 20:22 UTC
To: Peter Zijlstra
Cc: rcu, linux-kernel, kernel-team, mingo, jiangshanlai, dipankar, akpm,
    mathieu.desnoyers, josh, tglx, rostedt, dhowells, edumazet, fweisbec,
    oleg, joel

On Tue, Feb 18, 2020 at 09:11:42PM +0100, Peter Zijlstra wrote:
> On Tue, Feb 18, 2020 at 08:27:19AM -0800, Paul E. McKenney wrote:
> > On Tue, Feb 18, 2020 at 08:56:48AM +0100, Peter Zijlstra wrote:
>
> > > I just took offence at the Changelog wording. It seems to suggest there
> > > actually is a problem, there is not.
> >
> > Quoting the changelog: "Not appropriate for backporting due to failure
> > being unlikely."
>
> That implies there is failure, however unlikely.
>
> In this particular case there is absolutely no failure, except perhaps
> in KCSAN. This patch is a pure annotation such that KCSAN can understand
> the code.
>
> Like said, I don't object to the actual patch, but I do think it is
> important to call out false negatives or to describe the actual problem
> found.

I don't feel at all comfortable declaring that there is absolutely
no possibility of failure.

							Thanx, Paul

* Re: [PATCH tip/core/rcu 1/3] rcu-tasks: *_ONCE() for rcu_tasks_cbs_head
From: Steven Rostedt @ 2020-02-18 22:45 UTC
To: Paul E. McKenney
Cc: Peter Zijlstra, rcu, linux-kernel, kernel-team, mingo, jiangshanlai,
    dipankar, akpm, mathieu.desnoyers, josh, tglx, dhowells, edumazet,
    fweisbec, oleg, joel

On Tue, 18 Feb 2020 12:22:26 -0800
"Paul E. McKenney" <paulmck@kernel.org> wrote:

> On Tue, Feb 18, 2020 at 09:11:42PM +0100, Peter Zijlstra wrote:
> > On Tue, Feb 18, 2020 at 08:27:19AM -0800, Paul E. McKenney wrote:
> > > On Tue, Feb 18, 2020 at 08:56:48AM +0100, Peter Zijlstra wrote:
> >
> > > > I just took offence at the Changelog wording. It seems to suggest there
> > > > actually is a problem, there is not.
> > >
> > > Quoting the changelog: "Not appropriate for backporting due to failure
> > > being unlikely."
> >
> > That implies there is failure, however unlikely.
> >
> > In this particular case there is absolutely no failure, except perhaps
> > in KCSAN. This patch is a pure annotation such that KCSAN can understand
> > the code.
> >
> > Like said, I don't object to the actual patch, but I do think it is
> > important to call out false negatives or to describe the actual problem
> > found.
>
> I don't feel at all comfortable declaring that there is absolutely
> no possibility of failure.

Perhaps wording it like so:

"There's no known issue with the current code, but the *_ONCE()
annotations here make KCSAN happy, allowing us to focus on KCSAN
warnings that can help bring about known issues in other code that we
can fix, without being distracted by KCSAN warnings that we do not see
a problem with."

?

-- Steve

* Re: [PATCH tip/core/rcu 1/3] rcu-tasks: *_ONCE() for rcu_tasks_cbs_head
From: Paul E. McKenney @ 2020-02-18 22:54 UTC
To: Steven Rostedt
Cc: Peter Zijlstra, rcu, linux-kernel, kernel-team, mingo, jiangshanlai,
    dipankar, akpm, mathieu.desnoyers, josh, tglx, dhowells, edumazet,
    fweisbec, oleg, joel

On Tue, Feb 18, 2020 at 05:45:03PM -0500, Steven Rostedt wrote:
> On Tue, 18 Feb 2020 12:22:26 -0800
> "Paul E. McKenney" <paulmck@kernel.org> wrote:
>
> > On Tue, Feb 18, 2020 at 09:11:42PM +0100, Peter Zijlstra wrote:
> > > On Tue, Feb 18, 2020 at 08:27:19AM -0800, Paul E. McKenney wrote:
> > > > On Tue, Feb 18, 2020 at 08:56:48AM +0100, Peter Zijlstra wrote:
> > >
> > > > > I just took offence at the Changelog wording. It seems to suggest there
> > > > > actually is a problem, there is not.
> > > >
> > > > Quoting the changelog: "Not appropriate for backporting due to failure
> > > > being unlikely."
> > >
> > > That implies there is failure, however unlikely.
> > >
> > > In this particular case there is absolutely no failure, except perhaps
> > > in KCSAN. This patch is a pure annotation such that KCSAN can understand
> > > the code.
> > >
> > > Like said, I don't object to the actual patch, but I do think it is
> > > important to call out false negatives or to describe the actual problem
> > > found.
> >
> > I don't feel at all comfortable declaring that there is absolutely
> > no possibility of failure.
>
> Perhaps wording it like so:
>
> "There's no known issue with the current code, but the *_ONCE()
> annotations here make KCSAN happy, allowing us to focus on KCSAN
> warnings that can help bring about known issues in other code that we
> can fix, without being distracted by KCSAN warnings that we do not see
> a problem with."
>
> ?

That sounds more like something I might put in rcutodo.html as a statement
of the RCU approach to KCSAN reports.

But switching to a different situation (for variety, if nothing else),
what about the commit shown below?

							Thanx, Paul

------------------------------------------------------------------------

commit 35bc02b04a041f32470ae6d959c549bcce8483db
Author: Paul E. McKenney <paulmck@kernel.org>
Date:   Tue Feb 18 13:41:02 2020 -0800

    rcutorture: Mark data-race potential for rcu_barrier() test statistics

    The n_barrier_successes, n_barrier_attempts, and
    n_rcu_torture_barrier_error variables are updated (without access
    markings) by the main rcu_barrier() test kthread, and accessed (also
    without access markings) by the rcu_torture_stats() kthread. This of
    course can result in KCSAN complaints.

    Because the accesses are in diagnostic prints, this commit uses
    data_race() to excuse the diagnostic prints from the data race. If this
    were to ever cause bogus statistics prints (for example, due to store
    tearing), any misleading information would be disambiguated by the
    presence or absence of an rcutorture splat.

    This data race was reported by KCSAN. Not appropriate for backporting
    due to failure being unlikely and due to the mild consequences of the
    failure, namely a confusing rcutorture console message.

    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>

diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index 5453bd5..b3301f3 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -1444,9 +1444,9 @@ rcu_torture_stats_print(void)
 		atomic_long_read(&n_rcu_torture_timers));
 	torture_onoff_stats();
 	pr_cont("barrier: %ld/%ld:%ld\n",
-		n_barrier_successes,
-		n_barrier_attempts,
-		n_rcu_torture_barrier_error);
+		data_race(n_barrier_successes),
+		data_race(n_barrier_attempts),
+		data_race(n_rcu_torture_barrier_error));
 
 	pr_alert("%s%s ", torture_type, TORTURE_FLAG);
 	if (atomic_read(&n_rcu_torture_mberror) ||

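The shape of the rcutorture change above, a diagnostic print that reads counters through data_race(), can be sketched in userspace C. The data_race() macro here is an assumed stand-in that simply evaluates its argument (in the kernel it additionally tells KCSAN that the race is intentional), and the counter names and format_barrier_stats() are hypothetical, not the rcutorture code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Userspace stand-in (assumption): the value of data_race(expr) is just
 * the value of expr.  Only a race detector such as KCSAN would treat the
 * wrapped access differently from a plain one.
 */
#define data_race(expr)	(expr)

/* Hypothetical counters, updated with plain stores by one test thread. */
static long n_successes;
static long n_attempts;
static long n_errors;

/* Format the statistics line; returns what snprintf() returns. */
static int format_barrier_stats(char *buf, size_t len)
{
	assert(buf != NULL);
	return snprintf(buf, len, "barrier: %ld/%ld:%ld\n",
			data_race(n_successes),
			data_race(n_attempts),
			data_race(n_errors));
}
```

Only the reads in the print path are wrapped. The counter updates stay unmarked, which matches the diff above, where only the pr_cont() arguments change.
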
* Re: [PATCH tip/core/rcu 1/3] rcu-tasks: *_ONCE() for rcu_tasks_cbs_head
From: Steven Rostedt @ 2020-02-18 23:13 UTC
To: Paul E. McKenney
Cc: Peter Zijlstra, rcu, linux-kernel, kernel-team, mingo, jiangshanlai,
    dipankar, akpm, mathieu.desnoyers, josh, tglx, dhowells, edumazet,
    fweisbec, oleg, joel

On Tue, 18 Feb 2020 14:54:55 -0800
"Paul E. McKenney" <paulmck@kernel.org> wrote:

>     This data race was reported by KCSAN. Not appropriate for backporting
>     due to failure being unlikely and due to the mild consequences of the
>     failure, namely a confusing rcutorture console message.
>

I've seen patches backported for less. :-/

Really, any statement that says something may go awry with the code,
will be an argument to backport it.

-- Steve

* Re: [PATCH tip/core/rcu 1/3] rcu-tasks: *_ONCE() for rcu_tasks_cbs_head
From: Paul E. McKenney @ 2020-02-18 23:54 UTC
To: Steven Rostedt
Cc: Peter Zijlstra, rcu, linux-kernel, kernel-team, mingo, jiangshanlai,
    dipankar, akpm, mathieu.desnoyers, josh, tglx, dhowells, edumazet,
    fweisbec, oleg, joel

On Tue, Feb 18, 2020 at 06:13:23PM -0500, Steven Rostedt wrote:
> On Tue, 18 Feb 2020 14:54:55 -0800
> "Paul E. McKenney" <paulmck@kernel.org> wrote:
>
> >     This data race was reported by KCSAN. Not appropriate for backporting
> >     due to failure being unlikely and due to the mild consequences of the
> >     failure, namely a confusing rcutorture console message.
> >
>
> I've seen patches backported for less. :-/
>
> Really, any statement that says something may go awry with the code,
> will be an argument to backport it.

You aren't kidding! Rumor has it that someone tried backporting the
RCU flavor-consolidation work, for but one example. Though I cannot help
but salute the level of insanity represented by that attempt. ;-)

							Thanx, Paul

* Re: [PATCH tip/core/rcu 1/3] rcu-tasks: *_ONCE() for rcu_tasks_cbs_head
From: Joel Fernandes @ 2020-02-19 0:01 UTC
To: Paul E. McKenney
Cc: Steven Rostedt, Peter Zijlstra, rcu, linux-kernel, kernel-team, mingo,
    jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh, tglx, dhowells,
    edumazet, fweisbec, oleg

On Tue, Feb 18, 2020 at 02:54:55PM -0800, Paul E. McKenney wrote:
> On Tue, Feb 18, 2020 at 05:45:03PM -0500, Steven Rostedt wrote:
> > On Tue, 18 Feb 2020 12:22:26 -0800
> > "Paul E. McKenney" <paulmck@kernel.org> wrote:
> >
> > > On Tue, Feb 18, 2020 at 09:11:42PM +0100, Peter Zijlstra wrote:
> > > > On Tue, Feb 18, 2020 at 08:27:19AM -0800, Paul E. McKenney wrote:
> > > > > On Tue, Feb 18, 2020 at 08:56:48AM +0100, Peter Zijlstra wrote:
> > > >
> > > > > > I just took offence at the Changelog wording. It seems to suggest there
> > > > > > actually is a problem, there is not.
> > > > >
> > > > > Quoting the changelog: "Not appropriate for backporting due to failure
> > > > > being unlikely."
> > > >
> > > > That implies there is failure, however unlikely.
> > > >
> > > > In this particular case there is absolutely no failure, except perhaps
> > > > in KCSAN. This patch is a pure annotation such that KCSAN can understand
> > > > the code.
> > > >
> > > > Like said, I don't object to the actual patch, but I do think it is
> > > > important to call out false negatives or to describe the actual problem
> > > > found.
> > >
> > > I don't feel at all comfortable declaring that there is absolutely
> > > no possibility of failure.
> >
> > Perhaps wording it like so:
> >
> > "There's no known issue with the current code, but the *_ONCE()
> > annotations here make KCSAN happy, allowing us to focus on KCSAN
> > warnings that can help bring about known issues in other code that we
> > can fix, without being distracted by KCSAN warnings that we do not see
> > a problem with."
> >
> > ?
>
> That sounds more like something I might put in rcutodo.html as a statement
> of the RCU approach to KCSAN reports.
>
> But switching to a different situation (for variety, if nothing else),
> what about the commit shown below?
>
> 							Thanx, Paul
>
> ------------------------------------------------------------------------
>
> commit 35bc02b04a041f32470ae6d959c549bcce8483db
> Author: Paul E. McKenney <paulmck@kernel.org>
> Date:   Tue Feb 18 13:41:02 2020 -0800
>
>     rcutorture: Mark data-race potential for rcu_barrier() test statistics
>
>     The n_barrier_successes, n_barrier_attempts, and
>     n_rcu_torture_barrier_error variables are updated (without access
>     markings) by the main rcu_barrier() test kthread, and accessed (also
>     without access markings) by the rcu_torture_stats() kthread. This of
>     course can result in KCSAN complaints.
>
>     Because the accesses are in diagnostic prints, this commit uses
>     data_race() to excuse the diagnostic prints from the data race. If this
>     were to ever cause bogus statistics prints (for example, due to store
>     tearing), any misleading information would be disambiguated by the
>     presence or absence of an rcutorture splat.
>
>     This data race was reported by KCSAN. Not appropriate for backporting
>     due to failure being unlikely and due to the mild consequences of the
>     failure, namely a confusing rcutorture console message.
>
>     Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
>
> diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
> index 5453bd5..b3301f3 100644
> --- a/kernel/rcu/rcutorture.c
> +++ b/kernel/rcu/rcutorture.c
> @@ -1444,9 +1444,9 @@ rcu_torture_stats_print(void)
>  		atomic_long_read(&n_rcu_torture_timers));
>  	torture_onoff_stats();
>  	pr_cont("barrier: %ld/%ld:%ld\n",
> -		n_barrier_successes,
> -		n_barrier_attempts,
> -		n_rcu_torture_barrier_error);
> +		data_race(n_barrier_successes),
> +		data_race(n_barrier_attempts),
> +		data_race(n_rcu_torture_barrier_error));

Would it not be worth just fixing the data-race within rcutorture itself?

thanks,

 - Joel

* Re: [PATCH tip/core/rcu 1/3] rcu-tasks: *_ONCE() for rcu_tasks_cbs_head
  From: Paul E. McKenney @ 2020-02-19 0:16 UTC
  To: Joel Fernandes
  Cc: Steven Rostedt, Peter Zijlstra, rcu, linux-kernel, kernel-team, mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh, tglx, dhowells, edumazet, fweisbec, oleg

On Tue, Feb 18, 2020 at 07:01:44PM -0500, Joel Fernandes wrote:
> On Tue, Feb 18, 2020 at 02:54:55PM -0800, Paul E. McKenney wrote:
> > On Tue, Feb 18, 2020 at 05:45:03PM -0500, Steven Rostedt wrote:
> > > On Tue, 18 Feb 2020 12:22:26 -0800
> > > "Paul E. McKenney" <paulmck@kernel.org> wrote:
> > >
> > > > On Tue, Feb 18, 2020 at 09:11:42PM +0100, Peter Zijlstra wrote:
> > > > > On Tue, Feb 18, 2020 at 08:27:19AM -0800, Paul E. McKenney wrote:
> > > > > > On Tue, Feb 18, 2020 at 08:56:48AM +0100, Peter Zijlstra wrote:
> > > > >
> > > > > > > I just took offence at the Changelog wording. It seems to suggest there
> > > > > > > actually is a problem, there is not.
> > > > > >
> > > > > > Quoting the changelog: "Not appropriate for backporting due to failure
> > > > > > being unlikely."
> > > > >
> > > > > That implies there is failure, however unlikely.
> > > > >
> > > > > In this particular case there is absolutely no failure, except perhaps
> > > > > in KCSAN. This patch is a pure annotation such that KCSAN can understand
> > > > > the code.
> > > > >
> > > > > Like said, I don't object to the actual patch, but I do think it is
> > > > > important to call out false negatives or to describe the actual problem
> > > > > found.
> > > >
> > > > I don't feel at all comfortable declaring that there is absolutely
> > > > no possibility of failure.
> > >
> > > Perhaps wording it like so:
> > >
> > > "There's no known issue with the current code, but the *_ONCE()
> > > annotations here make KCSAN happy, allowing us to focus on KCSAN
> > > warnings that can help bring about known issues in other code that we
> > > can fix, without being distracted by KCSAN warnings that we do not see
> > > a problem with."
> > >
> > > ?
> >
> > That sounds more like something I might put in rcutodo.html as a statement
> > of the RCU approach to KCSAN reports.
> >
> > But switching to a different situation (for variety, if nothing else),
> > what about the commit shown below?
> >
> > 							Thanx, Paul
> >
> > ------------------------------------------------------------------------
> >
> > commit 35bc02b04a041f32470ae6d959c549bcce8483db
> > Author: Paul E. McKenney <paulmck@kernel.org>
> > Date:   Tue Feb 18 13:41:02 2020 -0800
> >
> >     rcutorture: Mark data-race potential for rcu_barrier() test statistics
> >
> >     The n_barrier_successes, n_barrier_attempts, and
> >     n_rcu_torture_barrier_error variables are updated (without access
> >     markings) by the main rcu_barrier() test kthread, and accessed (also
> >     without access markings) by the rcu_torture_stats() kthread. This of
> >     course can result in KCSAN complaints.
> >
> >     Because the accesses are in diagnostic prints, this commit uses
> >     data_race() to excuse the diagnostic prints from the data race. If this
> >     were to ever cause bogus statistics prints (for example, due to store
> >     tearing), any misleading information would be disambiguated by the
> >     presence or absence of an rcutorture splat.
> >
> >     This data race was reported by KCSAN. Not appropriate for backporting
> >     due to failure being unlikely and due to the mild consequences of the
> >     failure, namely a confusing rcutorture console message.
> >
> >     Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> >
> > diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
> > index 5453bd5..b3301f3 100644
> > --- a/kernel/rcu/rcutorture.c
> > +++ b/kernel/rcu/rcutorture.c
> > @@ -1444,9 +1444,9 @@ rcu_torture_stats_print(void)
> >  		atomic_long_read(&n_rcu_torture_timers));
> >  	torture_onoff_stats();
> >  	pr_cont("barrier: %ld/%ld:%ld\n",
> > -		n_barrier_successes,
> > -		n_barrier_attempts,
> > -		n_rcu_torture_barrier_error);
> > +		data_race(n_barrier_successes),
> > +		data_race(n_barrier_attempts),
> > +		data_race(n_rcu_torture_barrier_error));
>
> Would it be not worth just fixing the data-race within rcutorture itself?

I could use WRITE_ONCE() for updates and READ_ONCE() for statistics.
However, my current rule is that diagnostic code that is not participating
in the core synchronization uses data_race(). That way, if I do a typo
and write to (say) n_barrier_attempts in some other thread, KCSAN will
know to yell at me.

							Thanx, Paul
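The rule described in the message above (plain updates, data_race() only on the diagnostic read) can be sketched in userspace C roughly as follows. This is an illustrative sketch, not rcutorture code: data_race() is defined here as a do-nothing stand-in (in the kernel it additionally tells KCSAN that the race is intentional), and the counters and the helpers record_attempt() and format_stats() are hypothetical names.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Userspace stand-in (assumption): the kernel's data_race() also informs
 * KCSAN that the enclosed access may race on purpose.
 */
#define data_race(expr) (expr)

/* Hypothetical statistics, meant to be updated by exactly one test thread. */
static long n_attempts;
static long n_successes;

/*
 * Updater: deliberately uses plain, unmarked accesses.  Because the
 * variables are not marked, a stray write from some other thread would
 * still be reported by KCSAN in a real kernel build.
 */
static void record_attempt(int ok)
{
	n_attempts++;
	if (ok)
		n_successes++;
}

/*
 * Diagnostic reader: only the racy reads are wrapped in data_race(),
 * since a torn or stale value merely produces a confusing message.
 */
static int format_stats(char *buf, size_t len)
{
	return snprintf(buf, len, "barrier: %ld/%ld",
			data_race(n_successes), data_race(n_attempts));
}
```

Marking the updates with WRITE_ONCE() instead would also silence the report, but at the cost of hiding any accidental second writer, which is the trade-off the message describes.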
* Re: [PATCH tip/core/rcu 1/3] rcu-tasks: *_ONCE() for rcu_tasks_cbs_head
  From: Joel Fernandes @ 2020-02-19 1:13 UTC
  To: Paul E. McKenney
  Cc: Steven Rostedt, Peter Zijlstra, rcu, linux-kernel, kernel-team, mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh, tglx, dhowells, edumazet, fweisbec, oleg

On Tue, Feb 18, 2020 at 04:16:40PM -0800, Paul E. McKenney wrote:
> On Tue, Feb 18, 2020 at 07:01:44PM -0500, Joel Fernandes wrote:
> > On Tue, Feb 18, 2020 at 02:54:55PM -0800, Paul E. McKenney wrote:
> > > On Tue, Feb 18, 2020 at 05:45:03PM -0500, Steven Rostedt wrote:
> > > > On Tue, 18 Feb 2020 12:22:26 -0800
> > > > "Paul E. McKenney" <paulmck@kernel.org> wrote:
> > > >
> > > > > On Tue, Feb 18, 2020 at 09:11:42PM +0100, Peter Zijlstra wrote:
> > > > > > On Tue, Feb 18, 2020 at 08:27:19AM -0800, Paul E. McKenney wrote:
> > > > > > > On Tue, Feb 18, 2020 at 08:56:48AM +0100, Peter Zijlstra wrote:
> > > > > >
> > > > > > > > I just took offence at the Changelog wording. It seems to suggest there
> > > > > > > > actually is a problem, there is not.
> > > > > > >
> > > > > > > Quoting the changelog: "Not appropriate for backporting due to failure
> > > > > > > being unlikely."
> > > > > >
> > > > > > That implies there is failure, however unlikely.
> > > > > >
> > > > > > In this particular case there is absolutely no failure, except perhaps
> > > > > > in KCSAN. This patch is a pure annotation such that KCSAN can understand
> > > > > > the code.
> > > > > >
> > > > > > Like said, I don't object to the actual patch, but I do think it is
> > > > > > important to call out false negatives or to describe the actual problem
> > > > > > found.
> > > > >
> > > > > I don't feel at all comfortable declaring that there is absolutely
> > > > > no possibility of failure.
> > > >
> > > > Perhaps wording it like so:
> > > >
> > > > "There's no known issue with the current code, but the *_ONCE()
> > > > annotations here make KCSAN happy, allowing us to focus on KCSAN
> > > > warnings that can help bring about known issues in other code that we
> > > > can fix, without being distracted by KCSAN warnings that we do not see
> > > > a problem with."
> > > >
> > > > ?
> > >
> > > That sounds more like something I might put in rcutodo.html as a statement
> > > of the RCU approach to KCSAN reports.
> > >
> > > But switching to a different situation (for variety, if nothing else),
> > > what about the commit shown below?
> > >
> > > 							Thanx, Paul
> > >
> > > ------------------------------------------------------------------------
> > >
> > > commit 35bc02b04a041f32470ae6d959c549bcce8483db
> > > Author: Paul E. McKenney <paulmck@kernel.org>
> > > Date:   Tue Feb 18 13:41:02 2020 -0800
> > >
> > >     rcutorture: Mark data-race potential for rcu_barrier() test statistics
> > >
> > >     The n_barrier_successes, n_barrier_attempts, and
> > >     n_rcu_torture_barrier_error variables are updated (without access
> > >     markings) by the main rcu_barrier() test kthread, and accessed (also
> > >     without access markings) by the rcu_torture_stats() kthread. This of
> > >     course can result in KCSAN complaints.
> > >
> > >     Because the accesses are in diagnostic prints, this commit uses
> > >     data_race() to excuse the diagnostic prints from the data race. If this
> > >     were to ever cause bogus statistics prints (for example, due to store
> > >     tearing), any misleading information would be disambiguated by the
> > >     presence or absence of an rcutorture splat.
> > >
> > >     This data race was reported by KCSAN. Not appropriate for backporting
> > >     due to failure being unlikely and due to the mild consequences of the
> > >     failure, namely a confusing rcutorture console message.
> > >
> > >     Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> > >
> > > diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
> > > index 5453bd5..b3301f3 100644
> > > --- a/kernel/rcu/rcutorture.c
> > > +++ b/kernel/rcu/rcutorture.c
> > > @@ -1444,9 +1444,9 @@ rcu_torture_stats_print(void)
> > >  		atomic_long_read(&n_rcu_torture_timers));
> > >  	torture_onoff_stats();
> > >  	pr_cont("barrier: %ld/%ld:%ld\n",
> > > -		n_barrier_successes,
> > > -		n_barrier_attempts,
> > > -		n_rcu_torture_barrier_error);
> > > +		data_race(n_barrier_successes),
> > > +		data_race(n_barrier_attempts),
> > > +		data_race(n_rcu_torture_barrier_error));
> >
> > Would it be not worth just fixing the data-race within rcutorture itself?
>
> I could use WRITE_ONCE() for updates and READ_ONCE() for statistics.
> However, my current rule is that diagnostic code that is not participating
> in the core synchronization uses data_race(). That way, if I do a typo
> and write to (say) n_barrier_attempts in some other thread, KCSAN will
> know to yell at me.

Oh, ok. That makes sense.

Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>

thanks,

 - Joel
* Re: [PATCH tip/core/rcu 1/3] rcu-tasks: *_ONCE() for rcu_tasks_cbs_head
  From: Paul E. McKenney @ 2020-02-19 1:48 UTC
  To: Joel Fernandes
  Cc: Steven Rostedt, Peter Zijlstra, rcu, linux-kernel, kernel-team, mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh, tglx, dhowells, edumazet, fweisbec, oleg

On Tue, Feb 18, 2020 at 08:13:59PM -0500, Joel Fernandes wrote:
> On Tue, Feb 18, 2020 at 04:16:40PM -0800, Paul E. McKenney wrote:
> > On Tue, Feb 18, 2020 at 07:01:44PM -0500, Joel Fernandes wrote:
> > > On Tue, Feb 18, 2020 at 02:54:55PM -0800, Paul E. McKenney wrote:
> > > > On Tue, Feb 18, 2020 at 05:45:03PM -0500, Steven Rostedt wrote:
> > > > > On Tue, 18 Feb 2020 12:22:26 -0800
> > > > > "Paul E. McKenney" <paulmck@kernel.org> wrote:
> > > > >
> > > > > > On Tue, Feb 18, 2020 at 09:11:42PM +0100, Peter Zijlstra wrote:
> > > > > > > On Tue, Feb 18, 2020 at 08:27:19AM -0800, Paul E. McKenney wrote:
> > > > > > > > On Tue, Feb 18, 2020 at 08:56:48AM +0100, Peter Zijlstra wrote:
> > > > > > >
> > > > > > > > > I just took offence at the Changelog wording. It seems to suggest there
> > > > > > > > > actually is a problem, there is not.
> > > > > > > >
> > > > > > > > Quoting the changelog: "Not appropriate for backporting due to failure
> > > > > > > > being unlikely."
> > > > > > >
> > > > > > > That implies there is failure, however unlikely.
> > > > > > >
> > > > > > > In this particular case there is absolutely no failure, except perhaps
> > > > > > > in KCSAN. This patch is a pure annotation such that KCSAN can understand
> > > > > > > the code.
> > > > > > >
> > > > > > > Like said, I don't object to the actual patch, but I do think it is
> > > > > > > important to call out false negatives or to describe the actual problem
> > > > > > > found.
> > > > > >
> > > > > > I don't feel at all comfortable declaring that there is absolutely
> > > > > > no possibility of failure.
> > > > >
> > > > > Perhaps wording it like so:
> > > > >
> > > > > "There's no known issue with the current code, but the *_ONCE()
> > > > > annotations here make KCSAN happy, allowing us to focus on KCSAN
> > > > > warnings that can help bring about known issues in other code that we
> > > > > can fix, without being distracted by KCSAN warnings that we do not see
> > > > > a problem with."
> > > > >
> > > > > ?
> > > >
> > > > That sounds more like something I might put in rcutodo.html as a statement
> > > > of the RCU approach to KCSAN reports.
> > > >
> > > > But switching to a different situation (for variety, if nothing else),
> > > > what about the commit shown below?
> > > >
> > > > 							Thanx, Paul
> > > >
> > > > ------------------------------------------------------------------------
> > > >
> > > > commit 35bc02b04a041f32470ae6d959c549bcce8483db
> > > > Author: Paul E. McKenney <paulmck@kernel.org>
> > > > Date:   Tue Feb 18 13:41:02 2020 -0800
> > > >
> > > >     rcutorture: Mark data-race potential for rcu_barrier() test statistics
> > > >
> > > >     The n_barrier_successes, n_barrier_attempts, and
> > > >     n_rcu_torture_barrier_error variables are updated (without access
> > > >     markings) by the main rcu_barrier() test kthread, and accessed (also
> > > >     without access markings) by the rcu_torture_stats() kthread. This of
> > > >     course can result in KCSAN complaints.
> > > >
> > > >     Because the accesses are in diagnostic prints, this commit uses
> > > >     data_race() to excuse the diagnostic prints from the data race. If this
> > > >     were to ever cause bogus statistics prints (for example, due to store
> > > >     tearing), any misleading information would be disambiguated by the
> > > >     presence or absence of an rcutorture splat.
> > > >
> > > >     This data race was reported by KCSAN. Not appropriate for backporting
> > > >     due to failure being unlikely and due to the mild consequences of the
> > > >     failure, namely a confusing rcutorture console message.
> > > >
> > > >     Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> > > >
> > > > diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
> > > > index 5453bd5..b3301f3 100644
> > > > --- a/kernel/rcu/rcutorture.c
> > > > +++ b/kernel/rcu/rcutorture.c
> > > > @@ -1444,9 +1444,9 @@ rcu_torture_stats_print(void)
> > > >  		atomic_long_read(&n_rcu_torture_timers));
> > > >  	torture_onoff_stats();
> > > >  	pr_cont("barrier: %ld/%ld:%ld\n",
> > > > -		n_barrier_successes,
> > > > -		n_barrier_attempts,
> > > > -		n_rcu_torture_barrier_error);
> > > > +		data_race(n_barrier_successes),
> > > > +		data_race(n_barrier_attempts),
> > > > +		data_race(n_rcu_torture_barrier_error));
> > >
> > > Would it be not worth just fixing the data-race within rcutorture itself?
> >
> > I could use WRITE_ONCE() for updates and READ_ONCE() for statistics.
> > However, my current rule is that diagnostic code that is not participating
> > in the core synchronization uses data_race(). That way, if I do a typo
> > and write to (say) n_barrier_attempts in some other thread, KCSAN will
> > know to yell at me.
>
> Oh, ok. That makes sense.
>
> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>

Applied, thank you!

							Thanx, Paul
* Re: [PATCH tip/core/rcu 1/3] rcu-tasks: *_ONCE() for rcu_tasks_cbs_head
  From: Joel Fernandes @ 2020-02-17 18:23 UTC
  To: Peter Zijlstra
  Cc: paulmck, rcu, linux-kernel, kernel-team, mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh, tglx, rostedt, dhowells, edumazet, fweisbec, oleg, elver

On Mon, Feb 17, 2020 at 01:38:51PM +0100, Peter Zijlstra wrote:
> On Fri, Feb 14, 2020 at 04:25:18PM -0800, paulmck@kernel.org wrote:
> > From: "Paul E. McKenney" <paulmck@kernel.org>
> >
> > The RCU tasks list of callbacks, rcu_tasks_cbs_head, is sampled locklessly
> > by rcu_tasks_kthread() when waiting for work to do. This commit therefore
> > applies READ_ONCE() to that lockless sampling and WRITE_ONCE() to the
> > single potential store outside of rcu_tasks_kthread.
> >
> > This data race was reported by KCSAN. Not appropriate for backporting
> > due to failure being unlikely.
>
> What failure is possible here? AFAICT this is (again) one of them
> load-compare-against-constant-discard patterns that are impossible to
> mess up.

You mean that because we are only testing for NULL, so load/store tearing of
rcu_tasks_cbs_head is not an issue right?

I agree. Even with invented stores, worst case we have a false-wakeup and go
right back to sleep. Or, we read a partial rcu_tasks_cbs_head, and then go
acquire the lock and read the whole thing correctly under lock.

I wonder if we can teach KCSAN to actually ignore this kind of situation so
we don't need to employ READ_ONCE() for no reason. Basically ask it to not
bother if the read was only NULL-testing. +Marco since it is KCSAN related.

thanks,

 - Joel

> > Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> > ---
> >  kernel/rcu/update.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
> > index 6c4b862..a27df76 100644
> > --- a/kernel/rcu/update.c
> > +++ b/kernel/rcu/update.c
> > @@ -528,7 +528,7 @@ void call_rcu_tasks(struct rcu_head *rhp, rcu_callback_t func)
> >  	rhp->func = func;
> >  	raw_spin_lock_irqsave(&rcu_tasks_cbs_lock, flags);
> >  	needwake = !rcu_tasks_cbs_head;
> > -	*rcu_tasks_cbs_tail = rhp;
> > +	WRITE_ONCE(*rcu_tasks_cbs_tail, rhp);
> >  	rcu_tasks_cbs_tail = &rhp->next;
> >  	raw_spin_unlock_irqrestore(&rcu_tasks_cbs_lock, flags);
> >  	/* We can't create the thread unless interrupts are enabled. */
> > @@ -658,7 +658,7 @@ static int __noreturn rcu_tasks_kthread(void *arg)
> >  		/* If there were none, wait a bit and start over. */
> >  		if (!list) {
> >  			wait_event_interruptible(rcu_tasks_cbs_wq,
> > -						 rcu_tasks_cbs_head);
> > +						 READ_ONCE(rcu_tasks_cbs_head));
> >  			if (!rcu_tasks_cbs_head) {
> >  				WARN_ON(signal_pending(current));
> >  				schedule_timeout_interruptible(HZ/10);
> > --
> > 2.9.5
> >
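The pattern discussed in the message above (a lockless read used only as a NULL test, followed by an authoritative read under the lock) might look roughly like this in userspace C. This is a sketch under stated assumptions, not the kernel code: READ_ONCE() and WRITE_ONCE() are simplified volatile-cast stand-ins that rely on the GCC/Clang `__typeof__` extension, a C11 atomic_flag spinlock replaces raw_spin_lock_irqsave(), and struct cb, enqueue(), work_pending(), and dequeue_all() are hypothetical names.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Simplified stand-ins (assumption); the kernel versions do more. */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

struct cb {
	struct cb *next;
};

static struct cb *cbs_head;
static struct cb **cbs_tail = &cbs_head;
static atomic_flag cbs_lock = ATOMIC_FLAG_INIT;

static void cbs_lock_acquire(void)
{
	while (atomic_flag_test_and_set_explicit(&cbs_lock, memory_order_acquire))
		; /* spin until the flag is clear */
}

static void cbs_lock_release(void)
{
	atomic_flag_clear_explicit(&cbs_lock, memory_order_release);
}

/* Enqueue under the lock; the store a lockless reader may see is marked. */
static void enqueue(struct cb *c)
{
	c->next = NULL;
	cbs_lock_acquire();
	WRITE_ONCE(*cbs_tail, c);	/* may store to cbs_head itself */
	cbs_tail = &c->next;
	cbs_lock_release();
}

/* Lockless hint: the value is compared against NULL and then discarded. */
static int work_pending(void)
{
	return READ_ONCE(cbs_head) != NULL;
}

/* Authoritative read: detach the whole list while holding the lock. */
static struct cb *dequeue_all(void)
{
	struct cb *list;

	cbs_lock_acquire();
	list = cbs_head;
	cbs_head = NULL;
	cbs_tail = &cbs_head;
	cbs_lock_release();
	return list;
}
```

A wrong answer from work_pending() costs at most a spurious wakeup or one more wait, because nothing is dereferenced until dequeue_all() has reread the pointer under the lock.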
* Re: [PATCH tip/core/rcu 1/3] rcu-tasks: *_ONCE() for rcu_tasks_cbs_head
  From: Marco Elver @ 2020-02-17 18:38 UTC
  To: Joel Fernandes
  Cc: Peter Zijlstra, Paul E. McKenney, rcu, LKML, kernel-team, Ingo Molnar, Lai Jiangshan, dipankar, Andrew Morton, Mathieu Desnoyers, Josh Triplett, Thomas Gleixner, Steven Rostedt, David Howells, Eric Dumazet, fweisbec, Oleg Nesterov

On Mon, 17 Feb 2020 at 19:23, Joel Fernandes <joel@joelfernandes.org> wrote:
>
> On Mon, Feb 17, 2020 at 01:38:51PM +0100, Peter Zijlstra wrote:
> > On Fri, Feb 14, 2020 at 04:25:18PM -0800, paulmck@kernel.org wrote:
> > > From: "Paul E. McKenney" <paulmck@kernel.org>
> > >
> > > The RCU tasks list of callbacks, rcu_tasks_cbs_head, is sampled locklessly
> > > by rcu_tasks_kthread() when waiting for work to do. This commit therefore
> > > applies READ_ONCE() to that lockless sampling and WRITE_ONCE() to the
> > > single potential store outside of rcu_tasks_kthread.
> > >
> > > This data race was reported by KCSAN. Not appropriate for backporting
> > > due to failure being unlikely.
> >
> > What failure is possible here? AFAICT this is (again) one of them
> > load-compare-against-constant-discard patterns that are impossible to
> > mess up.
>
> You mean that because we are only testing for NULL, so load/store tearing of
> rcu_tasks_cbs_head is not an issue right?
>
> I agree. Even with invented stores, worst case we have a false-wakeup and go
> right back to sleep. Or, we read a partial rcu_tasks_cbs_head, and then go
> acquire the lock and read the whole thing correctly under lock.
>
> I wonder if we can teach KCSAN to actually ignore this kind of situation so
> we don't need to employ READ_ONCE() for no reason. Basically ask it to not
> bother if the read was only NULL-testing. +Marco since it is KCSAN related.

This came up before. It requires somehow making the compiler tell us
what type of operation we're doing and in what context:
https://lore.kernel.org/lkml/CANpmjNNZQsatHexXHm4dXvA0na6r9xMgVD5R+-8d7VXEBRi32w@mail.gmail.com/

In particular:

> > This particular rule relies on semantic analysis that is beyond what
> > the TSAN instrumentation currently supports. Right now we support GCC
> > and Clang; changing the compiler probably means we'd end up with only
> > one (probably Clang), and many more years before the change has
> > propagated to the majority of used compiler versions. It'd be good if
> > we can do this purely as a change in the kernel's codebase.

Load/store tearing might not be an issue, but we also have to be aware
of things like load fusing, e.g. in a loop.

That being said, there may be ways to get similar results without yet
changing the compiler.

Thanks,
-- Marco

> thanks,
>
>  - Joel
>
> > > Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> > > ---
> > >  kernel/rcu/update.c | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
> > > index 6c4b862..a27df76 100644
> > > --- a/kernel/rcu/update.c
> > > +++ b/kernel/rcu/update.c
> > > @@ -528,7 +528,7 @@ void call_rcu_tasks(struct rcu_head *rhp, rcu_callback_t func)
> > >  	rhp->func = func;
> > >  	raw_spin_lock_irqsave(&rcu_tasks_cbs_lock, flags);
> > >  	needwake = !rcu_tasks_cbs_head;
> > > -	*rcu_tasks_cbs_tail = rhp;
> > > +	WRITE_ONCE(*rcu_tasks_cbs_tail, rhp);
> > >  	rcu_tasks_cbs_tail = &rhp->next;
> > >  	raw_spin_unlock_irqrestore(&rcu_tasks_cbs_lock, flags);
> > >  	/* We can't create the thread unless interrupts are enabled. */
> > > @@ -658,7 +658,7 @@ static int __noreturn rcu_tasks_kthread(void *arg)
> > >  		/* If there were none, wait a bit and start over. */
> > >  		if (!list) {
> > >  			wait_event_interruptible(rcu_tasks_cbs_wq,
> > > -						 rcu_tasks_cbs_head);
> > > +						 READ_ONCE(rcu_tasks_cbs_head));
> > >  			if (!rcu_tasks_cbs_head) {
> > >  				WARN_ON(signal_pending(current));
> > >  				schedule_timeout_interruptible(HZ/10);
> > > --
> > > 2.9.5
> > >
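The load-fusing concern raised in the message above can be illustrated with a small userspace sketch. The assumptions are labeled: READ_ONCE() is a simplified volatile-cast stand-in using the GCC/Clang `__typeof__` extension, and stop_flag, spin_plain(), and spin_marked() are hypothetical names. With a plain load the compiler is permitted to read the flag once and reuse that value for every iteration, so a store from another thread might never be noticed; the marked load has to be performed on each pass.

```c
/* Simplified stand-in (assumption) for the kernel macro. */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical flag that some other thread would set to stop the loop. */
static int stop_flag;

/*
 * Plain load: nothing in the loop body writes stop_flag, so the compiler
 * may fuse the repeated loads into a single load before the loop.
 */
static unsigned long spin_plain(unsigned long max_spins)
{
	unsigned long n = 0;

	while (!stop_flag && n < max_spins)
		n++;
	return n;
}

/* Marked load: the volatile access must be repeated on every iteration. */
static unsigned long spin_marked(unsigned long max_spins)
{
	unsigned long n = 0;

	while (!READ_ONCE(stop_flag) && n < max_spins)
		n++;
	return n;
}
```

Run single-threaded, the two functions return the same counts; the difference only matters when another thread changes stop_flag while the loop is running.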
* Re: [PATCH tip/core/rcu 1/3] rcu-tasks: *_ONCE() for rcu_tasks_cbs_head
  From: Joel Fernandes @ 2020-02-17 19:32 UTC
  To: Marco Elver
  Cc: Peter Zijlstra, Paul E. McKenney, rcu, LKML, kernel-team, Ingo Molnar, Lai Jiangshan, dipankar, Andrew Morton, Mathieu Desnoyers, Josh Triplett, Thomas Gleixner, Steven Rostedt, David Howells, Eric Dumazet, fweisbec, Oleg Nesterov

On Mon, Feb 17, 2020 at 07:38:01PM +0100, Marco Elver wrote:
> On Mon, 17 Feb 2020 at 19:23, Joel Fernandes <joel@joelfernandes.org> wrote:
> >
> > On Mon, Feb 17, 2020 at 01:38:51PM +0100, Peter Zijlstra wrote:
> > > On Fri, Feb 14, 2020 at 04:25:18PM -0800, paulmck@kernel.org wrote:
> > > > From: "Paul E. McKenney" <paulmck@kernel.org>
> > > >
> > > > The RCU tasks list of callbacks, rcu_tasks_cbs_head, is sampled locklessly
> > > > by rcu_tasks_kthread() when waiting for work to do. This commit therefore
> > > > applies READ_ONCE() to that lockless sampling and WRITE_ONCE() to the
> > > > single potential store outside of rcu_tasks_kthread.
> > > >
> > > > This data race was reported by KCSAN. Not appropriate for backporting
> > > > due to failure being unlikely.
> > >
> > > What failure is possible here? AFAICT this is (again) one of them
> > > load-compare-against-constant-discard patterns that are impossible to
> > > mess up.
> >
> > You mean that because we are only testing for NULL, so load/store tearing of
> > rcu_tasks_cbs_head is not an issue right?
> >
> > I agree. Even with invented stores, worst case we have a false-wakeup and go
> > right back to sleep. Or, we read a partial rcu_tasks_cbs_head, and then go
> > acquire the lock and read the whole thing correctly under lock.
> >
> > I wonder if we can teach KCSAN to actually ignore this kind of situation so
> > we don't need to employ READ_ONCE() for no reason. Basically ask it to not
> > bother if the read was only NULL-testing. +Marco since it is KCSAN related.
>
> This came up before. It requires somehow making the compiler tell us
> what type of operation we're doing and in what context:
> https://lore.kernel.org/lkml/CANpmjNNZQsatHexXHm4dXvA0na6r9xMgVD5R+-8d7VXEBRi32w@mail.gmail.com/

Oh, wow. Ok.

> In particular:
>
> > > This particular rule relies on semantic analysis that is beyond what
> > > the TSAN instrumentation currently supports. Right now we support GCC
> > > and Clang; changing the compiler probably means we'd end up with only
> > > one (probably Clang), and many more years before the change has
> > > propagated to the majority of used compiler versions. It'd be good if
> > > we can do this purely as a change in the kernel's codebase.
>
> Load/store tearing might not be an issue, but we also have to be aware
> of things like load fusing, e.g. in a loop.

Ok. Makes sense.

> That being said, there may be ways to get similar results without yet
> changing the compiler.

Interesting.

thanks!

 - Joel

> Thanks,
> -- Marco
>
> > thanks,
> >
> >  - Joel
> >
> > > > Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> > > > ---
> > > >  kernel/rcu/update.c | 4 ++--
> > > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
> > > > index 6c4b862..a27df76 100644
> > > > --- a/kernel/rcu/update.c
> > > > +++ b/kernel/rcu/update.c
> > > > @@ -528,7 +528,7 @@ void call_rcu_tasks(struct rcu_head *rhp, rcu_callback_t func)
> > > >  	rhp->func = func;
> > > >  	raw_spin_lock_irqsave(&rcu_tasks_cbs_lock, flags);
> > > >  	needwake = !rcu_tasks_cbs_head;
> > > > -	*rcu_tasks_cbs_tail = rhp;
> > > > +	WRITE_ONCE(*rcu_tasks_cbs_tail, rhp);
> > > >  	rcu_tasks_cbs_tail = &rhp->next;
> > > >  	raw_spin_unlock_irqrestore(&rcu_tasks_cbs_lock, flags);
> > > >  	/* We can't create the thread unless interrupts are enabled. */
> > > > @@ -658,7 +658,7 @@ static int __noreturn rcu_tasks_kthread(void *arg)
> > > >  		/* If there were none, wait a bit and start over. */
> > > >  		if (!list) {
> > > >  			wait_event_interruptible(rcu_tasks_cbs_wq,
> > > > -						 rcu_tasks_cbs_head);
> > > > +						 READ_ONCE(rcu_tasks_cbs_head));
> > > >  			if (!rcu_tasks_cbs_head) {
> > > >  				WARN_ON(signal_pending(current));
> > > >  				schedule_timeout_interruptible(HZ/10);
> > > > --
> > > > 2.9.5
> > > >
* [PATCH tip/core/rcu 2/3] rcu: Add missing annotation for exit_tasks_rcu_start()
  From: paulmck @ 2020-02-15 0:25 UTC
  To: rcu
  Cc: linux-kernel, kernel-team, mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells, edumazet, fweisbec, oleg, joel, Jules Irenge, Paul E. McKenney

From: Jules Irenge <jbi.octave@gmail.com>

Sparse reports a warning at exit_tasks_rcu_start(void)

|warning: context imbalance in exit_tasks_rcu_start() - wrong count at exit

To fix this, this commit adds an __acquires(&tasks_rcu_exit_srcu).
Given that exit_tasks_rcu_start() does actually call __srcu_read_lock(),
this not only fixes the warning but also improves on the readability of
the code.

Signed-off-by: Jules Irenge <jbi.octave@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/update.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
index a27df76..a04fe54 100644
--- a/kernel/rcu/update.c
+++ b/kernel/rcu/update.c
@@ -801,7 +801,7 @@ static int __init rcu_spawn_tasks_kthread(void)
 core_initcall(rcu_spawn_tasks_kthread);
 
 /* Do the srcu_read_lock() for the above synchronize_srcu(). */
-void exit_tasks_rcu_start(void)
+void exit_tasks_rcu_start(void) __acquires(&tasks_rcu_exit_srcu)
 {
 	preempt_disable();
 	current->rcu_tasks_idx = __srcu_read_lock(&tasks_rcu_exit_srcu);
--
2.9.5
* Re: [PATCH tip/core/rcu 2/3] rcu: Add missing annotation for exit_tasks_rcu_start()
  From: Joel Fernandes @ 2020-02-17 14:44 UTC
  To: paulmck
  Cc: rcu, linux-kernel, kernel-team, mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells, edumazet, fweisbec, oleg, Jules Irenge

On Fri, Feb 14, 2020 at 04:25:19PM -0800, paulmck@kernel.org wrote:
> From: Jules Irenge <jbi.octave@gmail.com>
>
> Sparse reports a warning at exit_tasks_rcu_start(void)
>
> |warning: context imbalance in exit_tasks_rcu_start() - wrong count at exit
>
> To fix this, this commit adds an __acquires(&tasks_rcu_exit_srcu).
> Given that exit_tasks_rcu_start() does actually call __srcu_read_lock(),
> this not only fixes the warning but also improves on the readability of
> the code.

For patch 1/3 and 2/3:

Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>

Though IMO it would be good to squash both the patches.

thanks,

 - Joel

> Signed-off-by: Jules Irenge <jbi.octave@gmail.com>
> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> ---
>  kernel/rcu/update.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
> index a27df76..a04fe54 100644
> --- a/kernel/rcu/update.c
> +++ b/kernel/rcu/update.c
> @@ -801,7 +801,7 @@ static int __init rcu_spawn_tasks_kthread(void)
>  core_initcall(rcu_spawn_tasks_kthread);
>
>  /* Do the srcu_read_lock() for the above synchronize_srcu(). */
> -void exit_tasks_rcu_start(void)
> +void exit_tasks_rcu_start(void) __acquires(&tasks_rcu_exit_srcu)
>  {
>  	preempt_disable();
>  	current->rcu_tasks_idx = __srcu_read_lock(&tasks_rcu_exit_srcu);
> --
> 2.9.5
>
* Re: [PATCH tip/core/rcu 2/3] rcu: Add missing annotation for exit_tasks_rcu_start()
  From: Paul E. McKenney @ 2020-02-17 23:10 UTC
  To: Joel Fernandes
  Cc: rcu, linux-kernel, kernel-team, mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells, edumazet, fweisbec, oleg, Jules Irenge

On Mon, Feb 17, 2020 at 09:44:52AM -0500, Joel Fernandes wrote:
> On Fri, Feb 14, 2020 at 04:25:19PM -0800, paulmck@kernel.org wrote:
> > From: Jules Irenge <jbi.octave@gmail.com>
> >
> > Sparse reports a warning at exit_tasks_rcu_start(void)
> >
> > |warning: context imbalance in exit_tasks_rcu_start() - wrong count at exit
> >
> > To fix this, this commit adds an __acquires(&tasks_rcu_exit_srcu).
> > Given that exit_tasks_rcu_start() does actually call __srcu_read_lock(),
> > this not only fixes the warning but also improves on the readability of
> > the code.
>
> For patch 1/3 and 2/3:
>
> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>

Applied, thank you!

> Though IMO it would be good to squash both the patches.

Fair point, but I will leave them be. ;-)

							Thanx, Paul

> thanks,
>
>  - Joel
>
> > Signed-off-by: Jules Irenge <jbi.octave@gmail.com>
> > Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> > ---
> >  kernel/rcu/update.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
> > index a27df76..a04fe54 100644
> > --- a/kernel/rcu/update.c
> > +++ b/kernel/rcu/update.c
> > @@ -801,7 +801,7 @@ static int __init rcu_spawn_tasks_kthread(void)
> >  core_initcall(rcu_spawn_tasks_kthread);
> >
> >  /* Do the srcu_read_lock() for the above synchronize_srcu(). */
> > -void exit_tasks_rcu_start(void)
> > +void exit_tasks_rcu_start(void) __acquires(&tasks_rcu_exit_srcu)
> >  {
> >  	preempt_disable();
> >  	current->rcu_tasks_idx = __srcu_read_lock(&tasks_rcu_exit_srcu);
> > --
> > 2.9.5
> >
* [PATCH tip/core/rcu 3/3] rcu: Add missing annotation for exit_tasks_rcu_finish()
  From: paulmck @ 2020-02-15 0:25 UTC
  To: rcu
  Cc: linux-kernel, kernel-team, mingo, jiangshanlai, dipankar, akpm, mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells, edumazet, fweisbec, oleg, joel, Jules Irenge, Paul E. McKenney

From: Jules Irenge <jbi.octave@gmail.com>

Sparse reports a warning at exit_tasks_rcu_finish(void)

|warning: context imbalance in exit_tasks_rcu_finish()
|- wrong count at exit

To fix this, this commit adds a __releases(&tasks_rcu_exit_srcu).
Given that exit_tasks_rcu_finish() does actually call __srcu_read_unlock(),
this not only fixes the warning but also improves on the readability of
the code.

Signed-off-by: Jules Irenge <jbi.octave@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/rcu/update.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
index a04fe54..ede656c 100644
--- a/kernel/rcu/update.c
+++ b/kernel/rcu/update.c
@@ -809,7 +809,7 @@ void exit_tasks_rcu_start(void) __acquires(&tasks_rcu_exit_srcu)
 }
 
 /* Do the srcu_read_unlock() for the above synchronize_srcu(). */
-void exit_tasks_rcu_finish(void)
+void exit_tasks_rcu_finish(void) __releases(&tasks_rcu_exit_srcu)
 {
 	preempt_disable();
 	__srcu_read_unlock(&tasks_rcu_exit_srcu, current->rcu_tasks_idx);
--
2.9.5
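For readers unfamiliar with the annotations used in patches 2/3 and 3/3, the following sketch shows roughly how such markers behave outside the kernel. The macro definitions are modeled on the kernel's compiler_types.h as best recalled and should be treated as an assumption: they expand to sparse context attributes only when sparse defines __CHECKER__, and to nothing for an ordinary compiler. struct demo_lock, demo_enter(), and demo_exit() are hypothetical names, and the counter merely stands in for a real lock.

```c
/* Assumed to mirror the kernel's definitions; no effect for GCC/Clang. */
#ifdef __CHECKER__
# define __acquires(x)	__attribute__((context(x, 0, 1)))
# define __releases(x)	__attribute__((context(x, 1, 0)))
# define __acquire(x)	__context__(x, 1)
# define __release(x)	__context__(x, -1)
#else
# define __acquires(x)
# define __releases(x)
# define __acquire(x)	(void)0
# define __release(x)	(void)0
#endif

/* Hypothetical lock-like object; the counter stands in for real locking. */
struct demo_lock {
	int held;
};

static struct demo_lock demo;

/* Returns with the context held, so the function is marked __acquires(). */
static void demo_enter(void) __acquires(&demo)
{
	__acquire(&demo);
	demo.held++;
}

/* Entered with the context held and drops it, hence __releases(). */
static void demo_exit(void) __releases(&demo)
{
	demo.held--;
	__release(&demo);
}
```

Without the function-level annotation, sparse sees a function that changes the context count and exits unbalanced, which is the "wrong count at exit" warning quoted in both changelogs.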
Thread overview: 23+ messages (newest: 2020-02-19 1:48 UTC)

2020-02-15  0:24 [PATCH tip/core/rcu 0/3] Tasks-RCU updates for v5.7 Paul E. McKenney
2020-02-15  0:25 ` [PATCH tip/core/rcu 1/3] rcu-tasks: *_ONCE() for rcu_tasks_cbs_head paulmck
2020-02-17 12:38   ` Peter Zijlstra
2020-02-17 18:16     ` Paul E. McKenney
2020-02-18  7:56       ` Peter Zijlstra
2020-02-18 16:27         ` Paul E. McKenney
2020-02-18 20:11           ` Peter Zijlstra
2020-02-18 20:22             ` Paul E. McKenney
2020-02-18 22:45               ` Steven Rostedt
2020-02-18 22:54                 ` Paul E. McKenney
2020-02-18 23:13                   ` Steven Rostedt
2020-02-18 23:54                     ` Paul E. McKenney
2020-02-19  0:01                   ` Joel Fernandes
2020-02-19  0:16                     ` Paul E. McKenney
2020-02-19  1:13                       ` Joel Fernandes
2020-02-19  1:48                         ` Paul E. McKenney
2020-02-17 18:23     ` Joel Fernandes
2020-02-17 18:38       ` Marco Elver
2020-02-17 19:32         ` Joel Fernandes
2020-02-15  0:25 ` [PATCH tip/core/rcu 2/3] rcu: Add missing annotation for exit_tasks_rcu_start() paulmck
2020-02-17 14:44   ` Joel Fernandes
2020-02-17 23:10     ` Paul E. McKenney
2020-02-15  0:25 ` [PATCH tip/core/rcu 3/3] rcu: Add missing annotation for exit_tasks_rcu_finish() paulmck