All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	linux-kernel@vger.kernel.org, mingo@kernel.org,
	dipankar@in.ibm.com, akpm@linux-foundation.org,
	mathieu.desnoyers@efficios.com, josh@joshtriplett.org,
	tglx@linutronix.de, rostedt@goodmis.org, dhowells@redhat.com,
	edumazet@google.com, dvhart@linux.intel.com, fweisbec@gmail.com,
	oleg@redhat.com, bobby.prani@gmail.com
Subject: Re: [PATCH v3 tip/core/rcu 1/9] rcu: Add call_rcu_tasks()
Date: Wed, 6 Aug 2014 15:48:32 -0700	[thread overview]
Message-ID: <20140806224832.GA30085@linux.vnet.ibm.com> (raw)
In-Reply-To: <20140806005101.GW8101@linux.vnet.ibm.com>

On Tue, Aug 05, 2014 at 05:51:01PM -0700, Paul E. McKenney wrote:
> On Wed, Aug 06, 2014 at 08:33:29AM +0800, Lai Jiangshan wrote:
> > On 08/06/2014 05:55 AM, Paul E. McKenney wrote:
> > > On Tue, Aug 05, 2014 at 08:47:55AM +0800, Lai Jiangshan wrote:
> > >> On 08/04/2014 10:56 PM, Peter Zijlstra wrote:
> > >>> On Mon, Aug 04, 2014 at 02:25:15PM +0200, Peter Zijlstra wrote:
> > >>>> On Mon, Aug 04, 2014 at 04:50:44AM -0700, Paul E. McKenney wrote:
> > >>>>> OK, I will bite...
> > >>>>>
> > >>>>> What kinds of tasks are on a runqueue, but neither ->on_cpu nor
> > >>>>> PREEMPT_ACTIVE?
> > >>>>
> > >>>> Userspace tasks, they don't necessarily get PREEMPT_ACTIVE when
> > >>>> preempted. Now obviously you're not _that_ interested in userspace tasks
> > >>>> for this, so that might be ok.
> > >>>>
> > >>>> But the main point was, you cannot use ->on_cpu or PREEMPT_ACTIVE
> > >>>> without holding rq->lock.
> > >>>
> > >>> Hmm, maybe you can, we have the context switch in between setting
> > >>> ->on_cpu and clearing PREEMPT_ACTIVE and vice-versa.
> > >>>
> > >>> The context switch (obviously) provides a full barrier, so we might be
> > >>> able to -- with careful consideration -- read these two separate values
> > >>> and construct something usable from them.
> > >>>
> > >>> Something like:
> > >>>
> > >>> 	task_preempt_count(tsk) & PREEMPT_ACTIVE
> > >> 	the @tsk is running on_cpu, the above result is false.
> > >>> 	smp_rmb();
> > >>> 	tsk->on_cpu
> > >> 	now the @tsk is preempted, the above result also is false.
> > >>
> > >> 	so it is useless if we fetch the preempt_count and on_cpu in two separated
> > >> instructions.  Maybe it would work if we also take tsk->nivcsw in consideration.
> > >> (I just noticed that tsk->n[i]vcsw are the version numbers for the tsk->on_cpu)
> > >>
> > >> bool task_on_cpu_or_preempted(tsk)
> > >> {
> > >> 	unsigned long saved_nivcsw;
> > >>
> > >> 	saved_nivcsw = task->nivcsw;
> > >> 	if (tsk->on_cpu)
> > >> 		return true;
> > >>
> > >> 	smp_rmb();
> > >>
> > >> 	if (task_preempt_count(tsk) & PREEMPT_ACTIVE)
> > >> 		return true;
> > >>
> > >> 	smp_rmb();
> > >>
> > >> 	if (tsk->on_cpu || task->nivcsw != saved_nivcsw)
> > >> 		return true;
> > >>
> > >> 	return false;
> > >> }
> > >>
> > >>>
> > >>> And because we set PREEMPT_ACTIVE before clearing on_cpu, this should
> > >>> race the right way (err towards the inclusive side).
> > >>>
> > >>> Obviously that wants a big fat comment...
> > > 
> > > How about the following?  Non-nohz_full userspace tasks are already covered
> > > courtesy of scheduling-clock interrupts, and this handles nohz_full usermode
> > > tasks.
> > > 
> > > Thoughts?
> > > 
> > > 							Thanx, Paul
> > > 
> > > ------------------------------------------------------------------------
> > > 
> > > rcu: Make TASKS_RCU handle nohx_full= CPUs
> > > 
> > > Currently TASKS_RCU would ignore a CPU running a task in nohz_full=
> > > usermode execution.  There would be neither a context switch nor a
> > > scheduling-clock interrupt to tell TASKS_RCU that the task in question
> > > had passed through a quiescent state.  The grace period would therefore
> > > extend indefinitely.  This commit therefore makes RCU's dyntick-idle
> > > subsystem record the task_struct structure of the task that is running
> > > in dyntick-idle mode on each CPU.  The TASKS_RCU grace period can
> > > then access this information and record a quiescent state on
> > > behalf of any CPU running in dyntick-idle usermode.
> > > 
> > > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > > 
> > > diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> > > index f504f797c9c8..777aac3a34c0 100644
> > > --- a/include/linux/rcupdate.h
> > > +++ b/include/linux/rcupdate.h
> > > @@ -1140,5 +1140,14 @@ static inline void rcu_sysidle_force_exit(void)
> > >  
> > >  #endif /* #else #ifdef CONFIG_NO_HZ_FULL_SYSIDLE */
> > >  
> > > +#if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL)
> > > +struct task_struct *rcu_dynticks_task_cur(int cpu);
> > > +#else /* #if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL) */
> > > +static inline struct task_struct *rcu_dynticks_task_cur(int cpu)
> > > +{
> > > +	return NULL;
> > > +}
> > > +#endif /* #else #if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL) */
> > > +
> > >  
> > >  #endif /* __LINUX_RCUPDATE_H */
> > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > index 645a33efc0d4..86a0a7d5bbbd 100644
> > > --- a/kernel/rcu/tree.c
> > > +++ b/kernel/rcu/tree.c
> > > @@ -526,6 +526,7 @@ static void rcu_eqs_enter_common(struct rcu_dynticks *rdtp, long long oldval,
> > >  	atomic_inc(&rdtp->dynticks);
> > >  	smp_mb__after_atomic();  /* Force ordering with next sojourn. */
> > >  	WARN_ON_ONCE(atomic_read(&rdtp->dynticks) & 0x1);
> > > +	rcu_dynticks_task_enter(rdtp, current);
> > >  
> > >  	/*
> > >  	 * It is illegal to enter an extended quiescent state while
> > > @@ -642,6 +643,7 @@ void rcu_irq_exit(void)
> > >  static void rcu_eqs_exit_common(struct rcu_dynticks *rdtp, long long oldval,
> > >  			       int user)
> > >  {
> > > +	rcu_dynticks_task_exit(rdtp);
> > 
> > What happen when the trampoline happens before rcu_eqs_exit_common()?
> > synchronize_sched() also skip this CPUs.  I think, for all CPUs,  only
> > real schedule is reliable.
> 
> True, this prohibits tracing the point from the call to
> rcu_eqs_enter_common() to the transition to usermode.  I am betting that
> this is OK, though.

And if not, it is not all that hard to handle this case.

							Thanx, Paul

> > >  	smp_mb__before_atomic();  /* Force ordering w/previous sojourn. */
> > >  	atomic_inc(&rdtp->dynticks);
> > >  	/* CPUs seeing atomic_inc() must see later RCU read-side crit sects */
> > > diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
> > > index 0f69a79c5b7d..1e79fa1b7cbf 100644
> > > --- a/kernel/rcu/tree.h
> > > +++ b/kernel/rcu/tree.h
> > > @@ -88,6 +88,9 @@ struct rcu_dynticks {
> > >  				    /* Process level is worth LLONG_MAX/2. */
> > >  	int dynticks_nmi_nesting;   /* Track NMI nesting level. */
> > >  	atomic_t dynticks;	    /* Even value for idle, else odd. */
> > > +#if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL)
> > > +	struct task_struct *dynticks_tsk;
> > > +#endif /* #if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL) */
> > >  #ifdef CONFIG_NO_HZ_FULL_SYSIDLE
> > >  	long long dynticks_idle_nesting;
> > >  				    /* irq/process nesting level from idle. */
> > > @@ -579,6 +582,9 @@ static void rcu_sysidle_report_gp(struct rcu_state *rsp, int isidle,
> > >  static void rcu_bind_gp_kthread(void);
> > >  static void rcu_sysidle_init_percpu_data(struct rcu_dynticks *rdtp);
> > >  static bool rcu_nohz_full_cpu(struct rcu_state *rsp);
> > > +static void rcu_dynticks_task_enter(struct rcu_dynticks *rdtp,
> > > +				    struct task_struct *t);
> > > +static void rcu_dynticks_task_exit(struct rcu_dynticks *rdtp);
> > >  
> > >  #endif /* #ifndef RCU_TREE_NONCORE */
> > >  
> > > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> > > index a86a363ea453..442d62edc564 100644
> > > --- a/kernel/rcu/tree_plugin.h
> > > +++ b/kernel/rcu/tree_plugin.h
> > > @@ -2852,3 +2852,29 @@ static void rcu_bind_gp_kthread(void)
> > >  		set_cpus_allowed_ptr(current, cpumask_of(cpu));
> > >  #endif /* #ifdef CONFIG_NO_HZ_FULL */
> > >  }
> > > +
> > > +/* Record the current task on dyntick-idle entry. */
> > > +static void rcu_dynticks_task_enter(struct rcu_dynticks *rdtp,
> > > +				    struct task_struct *t)
> > > +{
> > > +#if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL)
> > > +	ACCESS_ONCE(rdtp->dynticks_tsk) = t;
> > > +#endif /* #if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL) */
> > > +}
> > > +
> > > +/* Record no current task on dyntick-idle exit. */
> > > +static void rcu_dynticks_task_exit(struct rcu_dynticks *rdtp)
> > > +{
> > > +	rcu_dynticks_task_enter(rdtp, NULL);
> > > +}
> > > +
> > > +#if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL)
> > > +struct task_struct *rcu_dynticks_task_cur(int cpu)
> > > +{
> > > +	struct rcu_dynticks *rdtp = &per_cpu(rcu_dynticks, cpu);
> > > +	struct task_struct *t = ACCESS_ONCE(rdtp->dynticks_tsk);
> > > +
> > > +	smp_read_barrier_depends(); /* Dereferences after fetch of "t". */
> > > +	return t;
> > > +}
> > > +#endif /* #if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL) */
> > > diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
> > > index ad2a8df43757..6ad6af2ab028 100644
> > > --- a/kernel/rcu/update.c
> > > +++ b/kernel/rcu/update.c
> > > @@ -481,6 +481,28 @@ static void check_holdout_task(struct task_struct *t,
> > >  	sched_show_task(current);
> > >  }
> > >  
> > > +/* Check for nohz_full CPUs executing in userspace. */
> > > +static void check_no_hz_full_tasks(void)
> > > +{
> > > +#ifdef CONFIG_NO_HZ_FULL
> > > +	int cpu;
> > > +	struct task_struct *t;
> > > +
> > > +	for_each_online_cpu(cpu) {
> > > +		cond_resched();
> > > +		rcu_read_lock();
> > > +		t = rcu_dynticks_task_cur(cpu);
> > > +		if (t == NULL || is_idle_task(t)) {
> > > +			rcu_read_unlock();
> > > +			continue;
> > > +		}
> > > +		if (ACCESS_ONCE(t->rcu_tasks_holdout))
> > > +			ACCESS_ONCE(t->rcu_tasks_holdout) = 0;
> > > +		rcu_read_unlock();
> > > +	}
> > > +#endif /* #ifdef CONFIG_NO_HZ_FULL */
> > > +}
> > > +
> > >  /* RCU-tasks kthread that detects grace periods and invokes callbacks. */
> > >  static int __noreturn rcu_tasks_kthread(void *arg)
> > >  {
> > > @@ -584,6 +606,7 @@ static int __noreturn rcu_tasks_kthread(void *arg)
> > >  				lastreport = jiffies;
> > >  			firstreport = true;
> > >  			WARN_ON(signal_pending(current));
> > > +			check_no_hz_full_tasks();
> > >  			rcu_read_lock();
> > >  			list_for_each_entry_rcu(t, &rcu_tasks_holdouts,
> > >  						rcu_tasks_holdout_list)
> > > 
> > > .
> > > 
> > 


  reply	other threads:[~2014-08-07  6:45 UTC|newest]

Thread overview: 122+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-07-31 21:54 [PATCH v3 tip/core/rcu 0/9 Paul E. McKenney
2014-07-31 21:55 ` [PATCH v3 tip/core/rcu 1/9] rcu: Add call_rcu_tasks() Paul E. McKenney
2014-07-31 21:55   ` [PATCH v3 tip/core/rcu 2/9] rcu: Provide cond_resched_rcu_qs() to force quiescent states in long loops Paul E. McKenney
2014-07-31 21:55   ` [PATCH v3 tip/core/rcu 3/9] rcu: Add synchronous grace-period waiting for RCU-tasks Paul E. McKenney
2014-08-01 15:09     ` Oleg Nesterov
2014-08-01 18:32       ` Paul E. McKenney
2014-08-01 19:44         ` Paul E. McKenney
2014-08-02 14:47           ` Oleg Nesterov
2014-08-02 22:58             ` Paul E. McKenney
2014-08-06  0:57               ` Steven Rostedt
2014-08-06  1:21                 ` Paul E. McKenney
2014-08-06  8:47                   ` Peter Zijlstra
2014-08-06 12:09                     ` Paul E. McKenney
2014-08-06 16:30                       ` Peter Zijlstra
2014-08-06 22:45                         ` Paul E. McKenney
2014-08-07  8:45                           ` Peter Zijlstra
2014-08-07 15:00                             ` Paul E. McKenney
2014-08-07 15:26                               ` Peter Zijlstra
2014-08-07 17:27                                 ` Peter Zijlstra
2014-08-07 18:46                                   ` Peter Zijlstra
2014-08-07 19:49                                     ` Steven Rostedt
2014-08-07 19:53                                       ` Steven Rostedt
2014-08-07 20:08                                         ` Peter Zijlstra
2014-08-07 21:18                                           ` Steven Rostedt
2014-08-08  6:40                                             ` Peter Zijlstra
2014-08-08 14:12                                               ` Steven Rostedt
2014-08-08 14:28                                                 ` Paul E. McKenney
2014-08-09 10:56                                                   ` Masami Hiramatsu
2014-08-08 14:34                                                 ` Peter Zijlstra
2014-08-08 14:58                                                   ` Steven Rostedt
2014-08-08 15:16                                                     ` Peter Zijlstra
2014-08-08 15:39                                                       ` Steven Rostedt
2014-08-08 16:01                                                         ` Peter Zijlstra
2014-08-08 16:10                                                           ` Steven Rostedt
2014-08-08 16:17                                                         ` Peter Zijlstra
2014-08-08 16:40                                                           ` Steven Rostedt
2014-08-08 16:52                                                             ` Peter Zijlstra
2014-08-08 16:27                                                     ` Peter Zijlstra
2014-08-08 16:39                                                       ` Paul E. McKenney
2014-08-08 16:49                                                         ` Steven Rostedt
2014-08-08 16:51                                                         ` Peter Zijlstra
2014-08-08 17:09                                                           ` Paul E. McKenney
2014-08-08 16:43                                                       ` Steven Rostedt
2014-08-08 16:50                                                         ` Peter Zijlstra
2014-08-08 17:27                                                       ` Steven Rostedt
2014-08-09 10:36                                                         ` Masami Hiramatsu
2014-08-07 20:06                                       ` Peter Zijlstra
2014-07-31 21:55   ` [PATCH v3 tip/core/rcu 4/9] rcu: Export RCU-tasks APIs to GPL modules Paul E. McKenney
2014-07-31 21:55   ` [PATCH v3 tip/core/rcu 5/9] rcutorture: Add torture tests for RCU-tasks Paul E. McKenney
2014-07-31 21:55   ` [PATCH v3 tip/core/rcu 6/9] rcutorture: Add RCU-tasks test cases Paul E. McKenney
2014-07-31 21:55   ` [PATCH v3 tip/core/rcu 7/9] rcu: Add stall-warning checks for RCU-tasks Paul E. McKenney
2014-07-31 21:55   ` [PATCH v3 tip/core/rcu 8/9] rcu: Improve RCU-tasks energy efficiency Paul E. McKenney
2014-07-31 21:55   ` [PATCH v3 tip/core/rcu 9/9] documentation: Add verbiage on RCU-tasks stall warning messages Paul E. McKenney
2014-07-31 23:57   ` [PATCH v3 tip/core/rcu 1/9] rcu: Add call_rcu_tasks() Frederic Weisbecker
2014-08-01  2:04     ` Paul E. McKenney
2014-08-01 15:06       ` Frederic Weisbecker
2014-08-01  1:15   ` Lai Jiangshan
2014-08-01  1:59     ` Paul E. McKenney
2014-08-01  1:31   ` Lai Jiangshan
2014-08-01  2:11     ` Paul E. McKenney
2014-08-01 14:11   ` Oleg Nesterov
2014-08-01 18:28     ` Paul E. McKenney
2014-08-01 18:40       ` Oleg Nesterov
2014-08-02 23:00         ` Paul E. McKenney
2014-08-03 12:57           ` Oleg Nesterov
2014-08-03 22:03             ` Paul E. McKenney
2014-08-04 13:29               ` Oleg Nesterov
2014-08-04 13:48                 ` Paul E. McKenney
2014-08-01 18:57   ` Oleg Nesterov
2014-08-02 22:50     ` Paul E. McKenney
2014-08-02 14:56   ` Oleg Nesterov
2014-08-02 22:57     ` Paul E. McKenney
2014-08-03 13:33       ` Oleg Nesterov
2014-08-03 22:05         ` Paul E. McKenney
2014-08-04  0:37           ` Lai Jiangshan
2014-08-04  1:09             ` Paul E. McKenney
2014-08-04 13:25               ` Oleg Nesterov
2014-08-04 13:51                 ` Paul E. McKenney
2014-08-04 13:52                   ` Paul E. McKenney
2014-08-04 13:32           ` Oleg Nesterov
2014-08-04 19:28             ` Paul E. McKenney
2014-08-04 19:32               ` Oleg Nesterov
2014-08-04  1:28   ` Lai Jiangshan
2014-08-04  7:46     ` Peter Zijlstra
2014-08-04  8:18       ` Lai Jiangshan
2014-08-04 11:50         ` Paul E. McKenney
2014-08-04 12:25           ` Peter Zijlstra
2014-08-04 12:37             ` Paul E. McKenney
2014-08-04 14:56             ` Peter Zijlstra
2014-08-05  0:47               ` Lai Jiangshan
2014-08-05 21:55                 ` Paul E. McKenney
2014-08-06  0:27                   ` Lai Jiangshan
2014-08-06  0:48                     ` Paul E. McKenney
2014-08-06  0:33                   ` Lai Jiangshan
2014-08-06  0:51                     ` Paul E. McKenney
2014-08-06 22:48                       ` Paul E. McKenney [this message]
2014-08-07  8:49                   ` Peter Zijlstra
2014-08-07 15:43                     ` Paul E. McKenney
2014-08-07 16:32                       ` Peter Zijlstra
2014-08-07 17:48                         ` Paul E. McKenney
2014-08-08 19:13   ` Peter Zijlstra
2014-08-08 20:58     ` Paul E. McKenney
2014-08-09  6:15       ` Peter Zijlstra
2014-08-09 12:44         ` Steven Rostedt
2014-08-09 16:05           ` Paul E. McKenney
2014-08-09 16:01         ` Paul E. McKenney
2014-08-09 18:19           ` Peter Zijlstra
2014-08-09 18:24             ` Peter Zijlstra
2014-08-10  1:29               ` Paul E. McKenney
2014-08-10  8:14                 ` Peter Zijlstra
2014-08-11  3:30                   ` Paul E. McKenney
2014-08-11 11:57                     ` Peter Zijlstra
2014-08-11 16:15                       ` Paul E. McKenney
2014-08-10  1:26             ` Paul E. McKenney
2014-08-10  8:12               ` Peter Zijlstra
2014-08-10 16:46                 ` Peter Zijlstra
2014-08-11  3:28                   ` Paul E. McKenney
2014-08-11  3:23                 ` Paul E. McKenney
2014-08-09 18:33       ` Peter Zijlstra
2014-08-10  1:38         ` Paul E. McKenney
2014-08-10 15:00           ` Peter Zijlstra
2014-08-11  3:37             ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140806224832.GA30085@linux.vnet.ibm.com \
    --to=paulmck@linux.vnet.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=bobby.prani@gmail.com \
    --cc=dhowells@redhat.com \
    --cc=dipankar@in.ibm.com \
    --cc=dvhart@linux.intel.com \
    --cc=edumazet@google.com \
    --cc=fweisbec@gmail.com \
    --cc=josh@joshtriplett.org \
    --cc=laijs@cn.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mingo@kernel.org \
    --cc=oleg@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.