linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Frederic Weisbecker <frederic@kernel.org>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Oleg Nesterov <oleg@redhat.com>,
	Russell King - ARM Linux admin <linux@armlinux.org.uk>,
	Chris Metcalf <cmetcalf@ezchip.com>,
	Christoph Lameter <cl@linux.com>, Kirill Tkhai <tkhai@yandex.ru>,
	Mike Galbraith <efault@gmx.de>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@kernel.org>,
	Linux List Kernel Mailing <linux-kernel@vger.kernel.org>,
	Davidlohr Bueso <dave@stgolabs.net>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [PATCH v2 4/4] task: RCUify the assignment of rq->curr
Date: Thu, 26 Sep 2019 14:42:32 +0200	[thread overview]
Message-ID: <20190926124231.GA25572@lenoir> (raw)
In-Reply-To: <87k19vyggy.fsf@x220.int.ebiederm.org>

On Wed, Sep 25, 2019 at 08:49:17PM -0500, Eric W. Biederman wrote:
> Frederic Weisbecker <frederic@kernel.org> writes:
> 
> > On Sat, Sep 14, 2019 at 07:35:02AM -0500, Eric W. Biederman wrote:
> >> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> >> index 69015b7c28da..668262806942 100644
> >> --- a/kernel/sched/core.c
> >> +++ b/kernel/sched/core.c
> >> @@ -3857,7 +3857,11 @@ static void __sched notrace __schedule(bool preempt)
> >>  
> >>  	if (likely(prev != next)) {
> >>  		rq->nr_switches++;
> >> -		rq->curr = next;
> >> +		/*
> >> +		 * RCU users of rcu_dereference(rq->curr) may not see
> >> +		 * changes to task_struct made by pick_next_task().
> >> +		 */
> >> +		RCU_INIT_POINTER(rq->curr, next);
> >
> > It would be nice to have more explanations in the comments as to why we
> > don't use rcu_assign_pointer() here (the very fast-path issue) and why
> > it is expected to be fine (the rq_lock() + post spinlock barrier) under
> > which condition. Some short summary of the changelog. Because that line
> > implies way too many subtleties.
> 
> Crucially that line documents the standard rules don't apply,
> and it documents which guarantees a new user of the code can probably
> count on.  I say probably because the comment may go stale before I new
> user of rcu appears.  I have my hopes things are simple enough at that
> location that if the comment needs to be changed it can be.

At least I can't understand that line without referring to the changelog.

> 
> If it is not obvious from reading the code that calls
> "task_rcu_dereference(rq->curr)" now "rcu_dereference(rq->curr)" why we
> don't need the guarantees from rcu_assign_pointet() my sense is that
> it should be those locations that document what guarantees they need.

Both sides should probably have comments.

> 
> Of the several different locations that use this my sense is that they
> all have different requirements.
> 
> - The rcuwait code just needs the lifetime change as it never dereferences
>   rq->curr.
> 
> - The membarrier code just looks at rq->curr->mm for a moment so it
>   hardly needs anything.  I suspect we might be able to make the rcu
>   critical section smaller in that code.
> 
> - I don't know the code in task_numa_compare() well enough even to make an
>   educated guess.  Peter asserts (if I read his reply correctly) it is
>   all just a heuristic so stale values should not matter.
> 
>   My reading of the code strongly suggests that we have the ordinary
>   rcu_assign_pointer() guarantees there.  The few fields that are not
>   covered by the ordinary guarantees do not appear to be read.  So even
>   if Peter is wrong RCU_INIT_POINTER appears safe to me.
> 
>   I also don't think we will have confusion with people reading the
>   code and expecting ordinary rcu_dereference semantics().
> 
> I can't possibly see putting the above several lines in a meaningful
> comment where RCU_INIT_POINTER is called.  Especially in a comment
> that will survive changes to any of those functions.  My experience
> is comments that try that are almost always overlooked when someone
> updates the code.

That's ok, it's the nature of comments, they get out of date. But at
least they provide a link to history so we can rewind to find the initial
how and why for a tricky line.

I bet nobody wants git blame as a base for their text editors.

> 
> I barely found all of the comments that depended upon the details of
> task_rcu_dereference and updated them in my patchset, when I removed
> the need for task_rcu_dereference.
> 
> I don't think it would be wise to put a comment that is a wall of words
> in the middle of __schedule().  I think it will become inaccurate with
> time and because it is a lot of words I think it will be ignored.
> 
> 
> As for the __schedule: It is the heart of the scheduler.  It is
> performance code.  It is clever code.  It is likely to stay that way
> because it is the scheduler.  There are good technical reasons for the
> code is the way it is, and anyone changing the scheduler in a
> responsible manner that includes benchmarking should find those
> technical reasons quickly enough.
> 
> 
> So I think a quick word to the wise is enough.  Comments are certainly
> not enough to prevent people being careless and making foolish mistakes.

Well it's not even about preventing anything, it's only about making
a line of cryptic code understandable for reviewers. No need for thorough
details, indeed anyone making use of that code or modifying it has to dive
into the deep guts anyway.

So how about that:

/*
 * Avoid rcu_dereference() in this very fast path.
 * Instead rely on full barrier implied by rq_lock() + smp_mb__after_spinlock().
 * Warning: In-between writes may be missed by readers (eg: pick_next_task())
 */

Thanks.

  reply	other threads:[~2019-09-26 12:42 UTC|newest]

Thread overview: 75+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-30 14:08 [BUG] Use of probe_kernel_address() in task_rcu_dereference() without checking return value Russell King - ARM Linux admin
2019-08-30 15:24 ` Oleg Nesterov
2019-08-30 15:30 ` Linus Torvalds
2019-08-30 15:40   ` Russell King - ARM Linux admin
2019-08-30 15:43     ` Linus Torvalds
2019-08-30 15:41   ` Linus Torvalds
2019-08-30 16:09     ` Oleg Nesterov
2019-08-30 16:21       ` Linus Torvalds
2019-08-30 16:44         ` Oleg Nesterov
2019-08-30 16:58           ` Linus Torvalds
2019-08-30 19:36         ` Eric W. Biederman
2019-09-02 13:40           ` Oleg Nesterov
2019-09-02 13:53             ` Peter Zijlstra
2019-09-02 14:44               ` Oleg Nesterov
2019-09-02 16:20                 ` Peter Zijlstra
2019-09-02 17:04             ` Eric W. Biederman
2019-09-02 17:34               ` Linus Torvalds
2019-09-03  4:50                 ` [PATCH 0/3] task: Making tasks on the runqueue rcu protected Eric W. Biederman
2019-09-03  4:51                   ` [PATCH 1/3] task: Add a count of task rcu users Eric W. Biederman
2019-09-04 14:36                     ` Oleg Nesterov
2019-09-04 14:44                     ` Frederic Weisbecker
2019-09-04 15:32                       ` Oleg Nesterov
2019-09-04 16:33                         ` Frederic Weisbecker
2019-09-04 18:20                           ` Linus Torvalds
2019-09-05 14:59                             ` Frederic Weisbecker
2019-09-03  4:52                   ` [PATCH 2/3] task: RCU protect tasks on the runqueue Eric W. Biederman
2019-09-03  7:41                     ` Peter Zijlstra
2019-09-03  7:47                       ` Peter Zijlstra
2019-09-03 16:44                         ` Eric W. Biederman
2019-09-03 17:08                           ` Linus Torvalds
2019-09-03 18:13                             ` Eric W. Biederman
2019-09-03 19:18                               ` Linus Torvalds
2019-09-03 20:06                                 ` Peter Zijlstra
2019-09-03 21:32                                   ` Paul E. McKenney
2019-09-05 20:02                                     ` Eric W. Biederman
2019-09-05 20:55                                       ` Paul E. McKenney
2019-09-06  7:07                                       ` Peter Zijlstra
2019-09-09 12:22                                         ` Eric W. Biederman
2019-09-25  7:36                                           ` Peter Zijlstra
2019-09-27  8:10                                   ` [tip: sched/urgent] tasks, sched/core: RCUify the assignment of rq->curr tip-bot2 for Eric W. Biederman
2019-09-03 19:42                               ` [PATCH 2/3] task: RCU protect tasks on the runqueue Peter Zijlstra
2019-09-14 12:31                           ` [PATCH v2 1/4] task: Add a count of task rcu users Eric W. Biederman
2019-09-14 12:31                           ` [PATCH v2 2/4] task: Ensure tasks are available for a grace period after leaving the runqueue Eric W. Biederman
2019-09-14 12:32                           ` [PATCH v2 3/4] task: With a grace period after finish_task_switch, remove unnecessary code Eric W. Biederman
2019-09-04 14:22                     ` [PATCH 2/3] task: RCU protect tasks on the runqueue Frederic Weisbecker
2019-09-03  4:52                   ` [PATCH 3/3] task: Clean house now that tasks on the runqueue are rcu protected Eric W. Biederman
2019-09-03  9:45                     ` kbuild test robot
2019-09-03 13:06                     ` Oleg Nesterov
2019-09-03 13:58                   ` [PATCH 0/3] task: Making tasks on the runqueue " Oleg Nesterov
2019-09-03 15:44                   ` Linus Torvalds
2019-09-03 19:46                     ` Peter Zijlstra
     [not found]                   ` <87muf7f4bf.fsf_-_@x220.int.ebiederm.org>
2019-09-14 12:33                     ` [PATCH v2 1/4] task: Add a count of task rcu users Eric W. Biederman
2019-09-15 13:54                       ` Paul E. McKenney
2019-09-27  8:10                       ` [tip: sched/urgent] tasks: Add a count of task RCU users tip-bot2 for Eric W. Biederman
2019-09-14 12:33                     ` [PATCH v2 2/4] task: Ensure tasks are available for a grace period after leaving the runqueue Eric W. Biederman
2019-09-15 14:07                       ` Paul E. McKenney
2019-09-15 14:09                         ` Paul E. McKenney
2019-09-27  8:10                       ` [tip: sched/urgent] tasks, sched/core: " tip-bot2 for Eric W. Biederman
2019-09-14 12:34                     ` [PATCH v2 3/4] task: With a grace period after finish_task_switch, remove unnecessary code Eric W. Biederman
2019-09-15 14:32                       ` Paul E. McKenney
2019-09-15 17:07                         ` Linus Torvalds
2019-09-15 18:47                           ` Paul E. McKenney
2019-09-27  8:10                       ` [tip: sched/urgent] tasks, sched/core: With a grace period after finish_task_switch(), " tip-bot2 for Eric W. Biederman
2019-09-14 12:35                     ` [PATCH v2 4/4] task: RCUify the assignment of rq->curr Eric W. Biederman
2019-09-15 14:41                       ` Paul E. McKenney
2019-09-15 17:59                         ` Eric W. Biederman
2019-09-15 18:25                           ` Eric W. Biederman
2019-09-15 18:48                             ` Paul E. McKenney
2019-09-20 23:02                       ` Frederic Weisbecker
2019-09-26  1:49                         ` Eric W. Biederman
2019-09-26 12:42                           ` Frederic Weisbecker [this message]
2019-09-14 17:43                     ` [PATCH v2 0/4] task: Making tasks on the runqueue rcu protected Linus Torvalds
2019-09-17 17:38                       ` Eric W. Biederman
2019-09-25  7:51                         ` Peter Zijlstra
2019-09-26  1:11                           ` Eric W. Biederman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190926124231.GA25572@lenoir \
    --to=frederic@kernel.org \
    --cc=cl@linux.com \
    --cc=cmetcalf@ezchip.com \
    --cc=dave@stgolabs.net \
    --cc=ebiederm@xmission.com \
    --cc=efault@gmx.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@armlinux.org.uk \
    --cc=mingo@kernel.org \
    --cc=oleg@redhat.com \
    --cc=paulmck@kernel.org \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=tkhai@yandex.ru \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).