linux-kernel.vger.kernel.org archive mirror
From: Frederic Weisbecker <fweisbec@gmail.com>
To: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: mathieu.desnoyers@efficios.com, dhowells@redhat.com,
	loic.minier@linaro.org, dhaval.giani@gmail.com,
	tglx@linutronix.de, peterz@infradead.org,
	linux-kernel@vger.kernel.org, josh@joshtriplett.org
Subject: Re: dyntick-hpc and RCU
Date: Fri, 5 Nov 2010 06:27:46 +0100
Message-ID: <20101105052740.GB6698@nowhere>
In-Reply-To: <20101104232148.GA28037@linux.vnet.ibm.com>

On Thu, Nov 04, 2010 at 04:21:48PM -0700, Paul E. McKenney wrote:
> Hello!
> 
> Just wanted some written record of our discussion this Wednesday.
> I don't have an email address for Jim Houston, and I am not sure I have
> all of the attendees, but here goes anyway.  Please don't hesitate to
> reply with any corrections!



Thanks a lot for doing this. I was about to send you an email
to ask for such a summary, especially for the 5th proposal, which
was actually not clear to me.




> 
> The goal is to be able to turn off scheduling-clock interrupts for
> long-running user-mode execution when there is but one runnable task
> on a given CPU, but while still allowing RCU to function correctly.
> In particular, we need to minimize (or better, eliminate) any source
> of interruption to such a CPU.  We discussed these approaches, along
> with their advantages and disadvantages:
> 
> 1.	If a user task is executing in dyntick-hpc mode, inform RCU
> 	of all kernel/user transitions, calling rcu_enter_nohz()
> 	on each transition to user-mode execution and calling
> 	rcu_exit_nohz() on each transition to kernel-mode execution.
> 
> 	+	Transitions due to interrupts and NMIs are already
> 		handled by the existing dyntick-idle code.
> 
> 	+	RCU works without changes.
> 
> 	-	-Every- exception path must be located and instrumented.


Yeah, that's bad.



> 
> 	-	Every system call must be instrumented.




Not really, we just need to enter the syscall slow path mode (which
is still a "-" point, but at least we don't need to inspect every syscall).



> 
> 	-	The system-call return fastpath is disabled by this
> 		approach, increasing the overhead of system calls.


Yep.



> 
> 	--	The scheduling-clock timer must be restarted on each
> 		transition to kernel-mode execution.  This is thought
> 		to be difficult on some of the exception code paths,
> 		and has high overhead regardless.



Right.
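
As an illustration of the bookkeeping in #1, here is a minimal user-space
sketch. It assumes a per-CPU counter that is even while the CPU runs in user
mode and odd while it runs in the kernel, similar in spirit to the existing
dyntick-idle counter. struct cpu_dynticks and cpu_was_quiescent() are made-up
names for this example; only rcu_enter_nohz() and rcu_exit_nohz() come from
the discussion, and their bodies here are simplified stand-ins.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU state for this sketch. */
struct cpu_dynticks {
	unsigned long dynticks;	/* even: user mode, odd: kernel mode */
};

/* Called on each transition to user-mode execution. */
static void rcu_enter_nohz(struct cpu_dynticks *c)
{
	c->dynticks++;
	assert((c->dynticks & 1UL) == 0);	/* must now be even */
}

/* Called on each transition to kernel-mode execution. */
static void rcu_exit_nohz(struct cpu_dynticks *c)
{
	c->dynticks++;
	assert((c->dynticks & 1UL) == 1);	/* must now be odd */
}

/*
 * Grace-period side: "snap" is the counter value sampled when the grace
 * period started.  The CPU has passed through a quiescent state if it was
 * in user mode at that time (even snapshot) or if the counter has moved
 * since, which means it entered user mode at least once.
 */
static bool cpu_was_quiescent(const struct cpu_dynticks *c, unsigned long snap)
{
	return (snap & 1UL) == 0 || c->dynticks != snap;
}
```

With a counter like this, the grace-period code can sample and compare the
value remotely, so a CPU that stays in user mode does not have to be
interrupted. The cost is the one listed above: every kernel entry and exit
path has to call one of the two hooks.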



> 
> 2.	Like #1 above, but instead of starting up the scheduling-clock
> 	timer on the CPU transitioning into the kernel, instead wake
> 	up a kthread that IPIs this CPU.  This has roughly the same
> 	advantages and disadvantages as #1 above, but substitutes
> 	a less-ugly kthread-wakeup operation in place of starting
> 	the scheduling-clock timer.
> 
> 	There are a number of variations on this approach, but the
> 	rest of them are infeasible due to the fact that irq-disable
> 	and preempt-disable code sections are implicit read-side
> 	critical sections for RCU-sched.




Yep, that approach is a bit better than 1.




> 3.	Substitute an RCU implementation similar to Jim Houston's
> 	real-time RCU implementation used by Concurrent.  (Jim posted
> 	this in 2004: http://lkml.org/lkml/2004/8/30/87 against
> 	2.6.1.1-mm4.)  In this implementation, the RCU grace periods
> 	are driven out of rcu_read_unlock(), so that there is no
> 	dependency on the scheduler-clock interrupt.
> 
> 	+	Allows dyntick-hpc to simply require this alternative
> 		RCU implementation, without the need to interact
> 		with it.
> 
> 	0	This implementation disables preemption across
> 		RCU read-side critical sections, which might be
> 		unacceptable for some users.  Or it might be OK,
> 		we were unable to determine this.



(Probably because of my misunderstanding of the question at that time)

Requiring preemption-disabled rcu read side critical sections
is probably not acceptable for our goals. This cpu isolation thing
is targeted at HPC (in which case I suspect it's perfectly
fine to have preemption disabled in rcu_read_lock()) but also at real
time uses (in which case we need rcu_read_lock() to be preemptible).

So this is rather a drawback.




> 
> 	0	This implementation increases the overhead of
> 		rcu_read_lock() and rcu_read_unlock().  However,
> 		this is probably acceptable, especially given that
> 		the workloads in question execute almost entirely
> 		in user space.



This overhead might need to be measured (if it's actually measurable),
but yeah.
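
To make the source of that overhead concrete, here is a toy user-space model
of grace periods being driven out of rcu_read_unlock(). It is only a sketch
under stated assumptions, not Jim Houston's implementation: struct
toy_rcu_cpu, its fields, and the toy_ functions are all hypothetical names,
and the real code would also have to disable preemption and coordinate
between CPUs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU state for the toy model. */
struct toy_rcu_cpu {
	int nesting;			/* depth of nested read-side sections */
	bool qs_needed;			/* a grace period is waiting on this CPU */
	unsigned long qs_reported;	/* quiescent states reported so far */
};

static void toy_rcu_read_lock(struct toy_rcu_cpu *c)
{
	c->nesting++;	/* the extra work on the lock side */
}

static void toy_rcu_read_unlock(struct toy_rcu_cpu *c)
{
	assert(c->nesting > 0);
	if (--c->nesting == 0 && c->qs_needed) {
		/*
		 * Outermost unlock: report the quiescent state right here,
		 * so no scheduling-clock interrupt is needed to notice it.
		 */
		c->qs_needed = false;
		c->qs_reported++;
	}
}
```

In this model the added cost per reader is one counter update on each side
plus one test on the outermost unlock, which is the kind of overhead that
would need measuring.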



> 
> 	---	Implicit RCU-sched and RCU-bh read-side critical
> 		sections would need to be explicitly marked with
> 		rcu_read_lock_sched() and rcu_read_lock_bh(),
> 		respectively.  Implicit critical sections include
> 		disabled preemption, disabled interrupts, hardirq
> 		handlers, and NMI handlers.  This change would
> 		require a large, intrusive, high-regression-risk patch.
> 		In addition, the hardirq-handler portion has been proposed
> 		and rejected in the past.



Now an alternative is to find who is really concerned by this
by looking at the users of rcu_dereference_sched() and
rcu_dereference_bh() (there are very few), then convert them to use
rcu_read_lock(), and then get rid of the sched and bh rcu flavours.
Not sure we want that though. But it's worth noting that removing
the call to rcu_bh_qs() after each softirq handler, or to rcu_check_callbacks()
from the timer, could somehow cancel the overhead from the rcu_read_unlock()
calls.

OTOH, on traditional rcu configs, this adds the overhead of calling
rcu_read_lock() in sched/bh critical sections that would usually have been
marked implicitly.

I guess this is probably a loss in the final picture.

Yet another solution is to require users of the bh and sched rcu flavours to
call a specific rcu_read_lock_sched()/bh, or something similar, that would
only be implemented in this new rcu config. We would only need to touch the
existing users and the future ones instead of adding an explicit call
to every implicit path.
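
A rough sketch of what such a config-dependent marker could look like, as a
user-space model. Everything prefixed with toy_ or CONFIG_TOY_ is a
hypothetical name invented for this example. The assumption is that the
marker always disables preemption, as an RCU-sched reader does today, and
only does extra tracking when the new rcu config is selected.

```c
#include <assert.h>

/* Hypothetical switch standing in for the new rcu config. */
#ifndef CONFIG_TOY_TICKLESS_RCU
#define CONFIG_TOY_TICKLESS_RCU 1
#endif

static int toy_preempt_count;	/* stand-in for the preempt counter */
static int toy_sched_nesting;	/* explicit RCU-sched reader depth */

static void toy_preempt_disable(void)
{
	toy_preempt_count++;
}

static void toy_preempt_enable(void)
{
	assert(toy_preempt_count > 0);
	toy_preempt_count--;
}

static void toy_rcu_read_lock_sched(void)
{
	toy_preempt_disable();
#if CONFIG_TOY_TICKLESS_RCU
	toy_sched_nesting++;	/* only the new config pays for tracking */
#endif
}

static void toy_rcu_read_unlock_sched(void)
{
#if CONFIG_TOY_TICKLESS_RCU
	assert(toy_sched_nesting > 0);
	toy_sched_nesting--;
#endif
	toy_preempt_enable();
}
```

With the switch set to 0 the two functions reduce to plain preemption
disable/enable, so traditional rcu configs would not see the added cost.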



> 
> 4.	Substitute an RCU implementation based on one of the
> 	user-level RCU implementations.  This has roughly the same
> 	advantages and disadvantages as does #3 above.
> 
> 5.	Don't tell RCU about dyntick-hpc mode, but instead make RCU
> 	push processing through via some processor that is kept out
> 	of dyntick-hpc mode.



I don't understand what you mean.
Do you mean that dyntick-hpc cpu would enqueue rcu callbacks to
another CPU? But how does that protect rcu critical sections
in our dyntick-hpc CPU?




>       This requires that the rcutree RCU
> 	priority boosting be pushed further along so that RCU grace period
> 	and callback processing is done in kthread context, permitting
> 	remote forcing of grace periods.



I should have a look at the rcu priority boosting to understand what you
mean here.



>       The RCU_JIFFIES_TILL_FORCE_QS
> 	macro is promoted to a config variable, retaining its value
> 	of 3 in absence of dyntick-hpc, but getting value of HZ
> 	(or thereabouts) for dyntick-hpc builds.  In dyntick-hpc
> 	builds, force_quiescent_state() would push grace periods
> 	for CPUs lacking a scheduling-clock interrupt.
> 
> 	+	Relatively small changes to RCU, some of which is
> 		coming with RCU priority boosting anyway.
> 
> 	+	No need to inform RCU of user/kernel transitions.
> 
> 	+	No need to turn scheduling-clock interrupts on
> 		at each user/kernel transition.
> 
> 	-	Some IPIs to dyntick-hpc CPUs remain, but these
> 		are down in the every-second-or-so frequency,
> 		so hopefully are not a real problem.


Hmm, I hope we can avoid that. Ideally the task in userspace shouldn't be
interrupted at all.

I wonder if we shouldn't go back to #3 eventually.
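
For reference, the interval selection described in #5 could be modelled as
below. This is only a sketch: CONFIG_DYNTICK_HPC is a hypothetical symbol
name (the discussion does not name one), HZ is given an assumed value so the
example builds in user space, and next_force_qs() and force_qs_due() are
made-up helpers. The values 3 and HZ are the ones quoted above.

```c
#include <assert.h>

#ifndef HZ
#define HZ 1000		/* assumed tick rate for this example */
#endif

/* Hypothetical option name; 3 jiffies by default, about a second with it. */
#ifdef CONFIG_DYNTICK_HPC
#define RCU_JIFFIES_TILL_FORCE_QS HZ
#else
#define RCU_JIFFIES_TILL_FORCE_QS 3
#endif

/* Jiffy at which forcing may first happen for a grace period. */
static unsigned long next_force_qs(unsigned long gp_start)
{
	/* A zero interval would force on every check. */
	assert(RCU_JIFFIES_TILL_FORCE_QS > 0);
	return gp_start + RCU_JIFFIES_TILL_FORCE_QS;
}

/* Wrap-safe comparison, in the style of the kernel's time_after_eq(). */
static int force_qs_due(unsigned long now, unsigned long gp_start)
{
	return (long)(now - next_force_qs(gp_start)) >= 0;
}
```

With the larger interval, a dyntick-hpc CPU that has not reported a
quiescent state would be disturbed roughly once per second per grace period
rather than every few jiffies, which is the remaining interruption discussed
above.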



> 
> 6.	Your idea here!
> 
> The general consensus at the end of the meeting was that #5 was most
> likely to work out the best.


At that time yeah.

But now I don't know, I really need to dig deeper into it and really
understand how #5 works before picking that direction :)

For now #3 seems more viable to me (with one of the additions I proposed).



> 							Thanx, Paul
> 
> PS.  If anyone knows Jim Houston's email address, please feel free
>      to forward to him.


I'll try to find him tomorrow and ask him for his email address :)

Thanks a lot!

